Funding: Supported by the National Natural Science Foundation of China (No. 62062016), the Doctoral Research Start-up Fund of Guangxi Normal University (RZ1900006676), and the Guangxi Project for Improving Middle-aged/Young Teachers' Ability (No. 2020KY020323).
Abstract: Most data publishing methods do not consider sensitivity protection, so an adversary can disclose private information through a sensitivity attack. To address this problem, this paper presents a medical data publishing method based on sensitivity determination. To protect sensitivity, the sensitivity of disease information is determined by semantics. To balance information utility against privacy, the new method focuses on protecting sensitive values with high sensitivity and assigns highly sensitive disease information to groups as evenly as possible. Experiments are conducted on two real-world datasets whose records include various patient attributes. To measure sensitivity protection, the authors define a metric that evaluates the degree of sensitivity disclosure. The additional information loss and discernibility metrics are also used to measure the availability of released tables. The experimental results indicate that the new method provides better privacy than the traditional one while preserving information utility. Beyond value protection, the proposed method provides sensitivity protection and usable releases for medical data.
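The grouping idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the sensitivity scores are a hand-written lookup (the paper derives them from semantics), and the `threshold` parameter, the round-robin assignment, and all function names are assumptions made for illustration. The discernibility metric is computed in its common form, the sum of squared group sizes.

```python
def assign_groups(records, sensitivity, num_groups, threshold=0.5):
    """Spread high-sensitivity records across groups round-robin, then the rest.

    `sensitivity` maps a disease name to a hypothetical score in [0, 1];
    `threshold` is an assumed cut-off for "highly sensitive".
    """
    high = [r for r in records if sensitivity[r["disease"]] >= threshold]
    low = [r for r in records if sensitivity[r["disease"]] < threshold]
    groups = [[] for _ in range(num_groups)]
    # Placing the high-sensitivity records first keeps their per-group
    # counts within one of each other.
    for i, rec in enumerate(high + low):
        groups[i % num_groups].append(rec)
    return groups


def high_sensitivity_counts(groups, sensitivity, threshold=0.5):
    """Count the highly sensitive records in each group."""
    return [
        sum(1 for r in g if sensitivity[r["disease"]] >= threshold)
        for g in groups
    ]


def discernibility(groups):
    """Discernibility metric: each record is charged the size of its group."""
    return sum(len(g) ** 2 for g in groups)
```

A lower discernibility value means smaller, more distinguishable groups and hence less information loss. The per-group counts show how evenly the highly sensitive values are spread.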
Funding: Supported by the Program for New Century Excellent Talents in Universities (Grant No. NCET-06-0290), the National Natural Science Foundation of China (Grant Nos. 60828004, 60503036), and the Fok Ying Tong Education Foundation Award (Grant No. 104027).
Abstract: Many data sharing applications require that published data protect sensitive information about individuals, such as patients' diseases, a customer's credit rating, or an employee's salary. Meanwhile, certain information must be published. In this paper, we consider data-publishing applications where the publisher specifies both the sensitive information and the shared information. An adversary can infer the real value of a sensitive entry with high confidence by using the published data. The goal is to protect sensitive information in the presence of data inference using association rules derived from the published data. We formulate the inference attack framework and develop complexity results. We show that computing a safe partial table is an NP-hard problem. We classify the general problem into subcases based on the requirements of the published information, and propose algorithms for finding a safe partial table to publish. We have conducted an empirical study to evaluate these algorithms on real data. The test results show that the proposed algorithms can produce approximately maximal published data and improve on the performance of existing algorithms.
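The inference risk described in this abstract can be made concrete with a minimal Python sketch. It computes the confidence of an association rule over the published rows, then flags a hidden entry as inferable when that confidence reaches a threshold. The row layout, the function names, and the threshold test are illustrative assumptions, not the paper's formal framework.

```python
def rule_confidence(rows, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent over published rows.

    `rows` is a list of dicts; `antecedent` and `consequent` are dicts of
    attribute -> value. Returns 0.0 when no row matches the antecedent.
    """
    matches = [
        r for r in rows
        if all(r.get(k) == v for k, v in antecedent.items())
    ]
    if not matches:
        return 0.0
    hits = sum(
        1 for r in matches
        if all(r.get(k) == v for k, v in consequent.items())
    )
    return hits / len(matches)


def is_inferable(rows, antecedent, consequent, threshold):
    """Assumed safety test: a hidden value counts as inferable when a rule
    derived from the published rows predicts it with enough confidence."""
    return rule_confidence(rows, antecedent, consequent) >= threshold
```

Under this reading, a publisher would withhold further cells until no rule derived from the remaining published rows reaches the threshold for any sensitive entry. Finding the largest such table is the problem the abstract reports as NP-hard.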
Funding: Supported by the National Natural Science Foundation of China (61673045) and the Beijing Natural Science Foundation (4152040).
Abstract: This paper surveys iterative learning control (ILC) with incomplete information and the associated control system design, which is a frontier of the ILC field. Incomplete information, of both passive and active types, can cause data loss or fragmentation due to various factors. Passive incomplete information refers to incomplete data and information caused by practical system limitations during data collection, storage, transmission, and processing, such as data dropouts, delays, disordering, and limited transmission bandwidth. Active incomplete information refers to incomplete data and information caused by deliberate reduction of data quantity and quality, on the premise that the given objective is still satisfied, such as sampling and quantization. This survey emphasizes two aspects: the first is how to guarantee good learning and tracking performance with passive incomplete data, and the second is how to balance the control performance index against data demand by active means. Promising research directions on this topic are also addressed, with particular emphasis on data robustness. This survey is expected to improve the quantitative understanding of the restrictive relationship and trade-off between incomplete data and tracking performance, and to promote further development of ILC theory.
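The passive case of data dropouts can be sketched in Python with a P-type update law, u_{k+1}(t) = u_k(t) + L e_k(t+1), where an update is skipped whenever the error measurement is lost. The first-order plant, the gain, the Bernoulli dropout model, and all names below are assumptions chosen for illustration and are not taken from the survey.

```python
import random


def simulate(u, a=0.5, b=1.0, c=1.0):
    """Outputs y(1..T) of x(t+1) = a*x(t) + b*u(t), y = c*x, with x(0) = 0."""
    x, ys = 0.0, []
    for ut in u:
        x = a * x + b * ut
        ys.append(c * x)
    return ys


def ilc_with_dropouts(reference, iterations=200, gain=0.5,
                      dropout_rate=0.3, seed=0):
    """P-type ILC where each error sample is lost with probability dropout_rate.

    Returns the final input and the max absolute tracking error recorded
    at the start of every iteration.
    """
    rng = random.Random(seed)
    u = [0.0] * len(reference)
    history = []
    for _ in range(iterations):
        y = simulate(u)
        errors = [r - yt for r, yt in zip(reference, y)]
        history.append(max(abs(e) for e in errors))
        for t, e in enumerate(errors):
            # errors[t] is e_k(t+1); update u(t) only if the sample arrived.
            if rng.random() >= dropout_rate:
                u[t] += gain * e
    return u, history
```

With these assumed values, |1 - L·c·b| = 0.5 < 1, so each received sample contracts the error at its time step. Dropouts slow the learning but do not stop it, which is the kind of trade-off between data completeness and tracking performance the survey discusses.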
Funding: Supported by the National High-tech R&D Program ("863" Program) of China (No. 2013AA013505), the National Science Foundation of China (No. 61472213), and the State Scholarship Fund from the China Scholarship Council (No. 201406210270).
Abstract: The basic function of the Internet is to deliver data (what) to serve the needs of all applications. IP names the attachment points (where) to facilitate ubiquitous interconnectivity as the current way to deliver data. The fundamental mismatch between data delivery and naming attachment points leads to many challenges, e.g., mapping from data name to IP address, handling the dynamics of the underlying topology, scaling up data distribution, and securing communication. Information-centric networking (ICN) is proposed to shift the focus of the communication paradigm from where to what, by making named data the first-class citizen in the network. The basic consensus of ICN is to name the data independently of its container (space dimension) and session (time dimension), which breaks the limitation of point-to-point IP semantics. It scales up data distribution by utilizing available resources, and facilitates communication across diverse connectivity and heterogeneous networks. However, there is little consensus on the detailed design of ICN, and quite a few different ICN architectures have been proposed. This paper reveals the rationales of ICN from the perspective of Internet evolution, surveys different design choices, and discusses two debatable topics in ICN, i.e., self-certifying versus hierarchical names, and edge versus pervasive caching. We hope this survey helps clarify some misunderstandings about ICN and build more consensus.
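The contrast between self-certifying and hierarchical names can be sketched in a few lines of Python. Here a self-certifying name is simplified to a SHA-256 digest of the content, whereas real designs often bind the name to a publisher key instead. A hierarchical name is resolved by longest-prefix match against a hypothetical forwarding table. The `sha256:` prefix, the table layout, and the function names are all assumptions for illustration.

```python
import hashlib


def self_certifying_name(content: bytes) -> str:
    """Derive a flat name from the content itself (simplified: content hash)."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify(name: str, content: bytes) -> bool:
    """Any node can check the name-to-data binding without a trusted third party."""
    return name == self_certifying_name(content)


def longest_prefix_match(name: str, forwarding_table: dict):
    """Resolve a hierarchical name such as /org/video/clip1 to a next hop.

    `forwarding_table` maps name prefixes to hypothetical face identifiers.
    Returns None when no prefix matches.
    """
    parts = name.strip("/").split("/")
    for end in range(len(parts), 0, -1):
        prefix = "/" + "/".join(parts[:end])
        if prefix in forwarding_table:
            return forwarding_table[prefix]
    return None
```

The sketch shows the trade-off under debate: flat self-certifying names verify themselves but cannot be aggregated in routing tables, while hierarchical names aggregate by prefix but need a separate mechanism to authenticate the data.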
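Abstract: How to ensure that sensitive information is not disclosed when publishing data involving personal privacy, while maximizing the utility of the published data, is a major challenge in privacy protection. In recent years, researchers in China and abroad have studied privacy-preserving data publishing (PPDP) extensively, and a timely summary of the results helps clarify research directions. This paper summarizes privacy protection results in the field of data publishing, introduces commonly used privacy protection models and techniques, privacy metrics, and algorithms, focuses on the application of PPDP in different scenarios, and points out possible research topics and application prospects for PPDP.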