Journal literature
334,496 articles found
A novel method for clustering cellular data to improve classification
1
Authors: Diek W. Wheeler, Giorgio A. Ascoli. Journal: Neural Regeneration Research (SCIE, CAS), 2025, Issue 9, pp. 2697-2705, 9 pages.
Many fields, such as neuroscience, are experiencing the vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
Keywords: cellular data; clustering dendrogram; data classification; Levene's one-tailed statistical test; unsupervised hierarchical clustering
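The keywords above name Levene's test as the statistical check paired with the clustering, but the abstract does not say how the protocol applies it to candidate splits. As a minimal sketch only, the following computes the standard mean-centered Levene statistic W for k groups of numbers; the function name `levene_w` and the choice of what the groups contain are assumptions for illustration, not the authors' code.

```python
from statistics import mean


def levene_w(groups):
    """Levene's W statistic (mean-centered form) for k groups of numbers.

    Assumes at least two groups and at least one nonzero spread of the
    absolute deviations (otherwise the denominator is zero).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each value from its own group mean.
    z = []
    for g in groups:
        g_mean = mean(g)
        z.append([abs(y - g_mean) for y in g])
    z_group = [mean(zi) for zi in z]              # mean deviation per group
    z_all = sum(sum(zi) for zi in z) / n_total    # mean deviation overall
    between = sum(len(zi) * (zg - z_all) ** 2 for zi, zg in zip(z, z_group))
    within = sum((v - zg) ** 2 for zi, zg in zip(z, z_group) for v in zi)
    return (n_total - k) / (k - 1) * between / within
```

Under the null hypothesis of equal variances, W is compared against an F distribution with (k - 1, N - k) degrees of freedom; that lookup is left out here to keep the sketch free of third-party dependencies.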
Ending Privacy’s Gremlin: Stopping the Data-Broker Loophole to the Fourth Amendment’s Search Warrant Requirement
2
Authors: Samantha B. Larkin, Shakour Abuzneid. Journal: Journal of Information Security, 2024, Issue 4, pp. 589-611, 23 pages.
Advances in technology require upgrades in the law. One such area involves data brokers, which have thus far gone unregulated. Data brokers use artificial intelligence to aggregate information into data profiles about individual Americans derived from consumer use of the internet and connected devices. Data profiles are then sold for profit. Government investigators use a legal loophole to purchase this data instead of obtaining a search warrant, which the Fourth Amendment would otherwise require. Consumers have lacked a reasonable means to fight or correct the information data brokers collect. Americans may not even be aware of the risks of data aggregation, which upends the test of reasonable expectations used in a search warrant analysis. Data aggregation should be controlled and regulated, which is the direction some privacy laws take. Legislatures must step forward to safeguard against shadowy data-profiling practices, whether abroad or at home. In the meantime, courts can modify their search warrant analysis by including data privacy principles.
Keywords: Access Control; Access Rights; Artificial Intelligence; Consumer Behavior; Consumer Protection; Criminal Law; Data Brokers; Data Handling; Data Privacy; Data Processing; Data Profiling; Digital Forensics
Synthetic data as an investigative tool in hypertension and renal diseases research
3
Authors: Aleena Jamal, Som Singh, Fawad Qureshi. Journal: World Journal of Methodology, 2025, Issue 1, pp. 9-13, 5 pages.
There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need to create a framework for guidance regarding how to better utilize synthetic data as a practical application in this research.
Keywords: Synthetic data; Artificial intelligence; Nephrology; Blood pressure; Research; Editorial
Data Component: An Innovative Framework for Information Value Metrics in the Digital Economy
4
Authors: Tao Xiaoming, Wang Yu, Peng Jieyang, Zhao Yuelin, Wang Yue, Wang Youzheng, Hu Chengsheng, Lu Zhipeng. Journal: China Communications (SCIE, CSCD), 2024, Issue 5, pp. 17-35, 19 pages.
The increasing dependence on data highlights the need for a detailed understanding of its behavior, encompassing the challenges involved in processing and evaluating it. However, current research lacks a comprehensive structure for measuring the worth of data elements, hindering effective navigation of the changing digital environment. This paper aims to fill this research gap by introducing the innovative concept of “data components.” It proposes a graph-theoretic representation model that presents a clear mathematical definition and demonstrates the superiority of data components over traditional processing methods. Additionally, the paper introduces an information measurement model that provides a way to calculate the information entropy of data components and establish their increased informational value. The paper also assesses the value of information, suggesting a pricing mechanism based on its significance. In conclusion, this paper establishes a robust framework for understanding and quantifying the value of implicit information in data, laying the groundwork for future research and practical applications.
Keywords: Data component; Data element; Data governance; Data science; Information theory
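The abstract above mentions calculating the information entropy of data components without giving the paper's measurement model. As a rough illustration of the underlying quantity only, the sketch below computes the standard Shannon entropy (in bits) of the empirical distribution of a list of values; the function name `shannon_entropy` and the idea of treating a data component as a flat list of symbols are assumptions, not the paper's graph-theoretic definition.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # H = -sum(p * log2(p)) over the observed symbols.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A list with two equally frequent symbols carries one bit per item, while a list that repeats a single symbol carries none, which matches the intuition that more varied data holds more information.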
Traffic Flow Prediction with Heterogeneous Spatiotemporal Data Based on a Hybrid Deep Learning Model Using Attention-Mechanism
5
Authors: Jing-Doo Wang, Chayadi Oktomy Noto Susanto. Journal: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 8, pp. 1711-1728, 18 pages.
A significant obstacle in intelligent transportation systems (ITS) is the capacity to predict traffic flow. Recent advancements in deep neural networks have enabled the development of models to represent traffic flow accurately. However, accurately predicting traffic flow at the individual road level is extremely difficult due to the complex interplay of spatial and temporal factors. This paper proposes a technique for predicting short-term traffic flow data using an architecture that utilizes convolutional bidirectional long short-term memory (Conv-BiLSTM) with attention mechanisms. Prior studies neglected to include data pertaining to factors such as holidays, weather conditions, and vehicle types, which are interconnected and significantly impact the accuracy of forecast outcomes. In addition, this research incorporates recurring monthly periodic pattern data that significantly enhances the accuracy of forecast outcomes. The experimental findings demonstrate a performance improvement of 21.68% when incorporating the vehicle type feature.
Keywords: Traffic flow prediction; spatiotemporal data; heterogeneous data; Conv-BiLSTM; data-centric; intra-data
Research on Data Theory of Value
6
Authors: Li Haijian, Zhao Li. Journal: China Economist, 2024, Issue 3, pp. 21-38, 18 pages.
This paper explores the data theory of value along a line of reasoning that runs from the epochal characteristics of data, to theoretical innovation, to paradigmatic transformation. Through a comparison of hard and soft factors and observation of the peculiar features of data, it draws the conclusion that data have the epochal characteristics of non-competitiveness and non-exclusivity, decreasing marginal cost and increasing marginal return, non-physical and intangible form, and non-finiteness and non-scarcity. It is the epochal characteristics of data that undermine the traditional theory of value and innovate the “production-exchange” theory, including data value generation, data value realization, data value rights determination, and data value pricing. From the perspective of data value generation, the levels of data quality, processing, use, and connectivity, data application scenarios, and data openness will influence data value. From the perspective of data value realization, data, as independent factors of production, show a value creation effect, create a value multiplier effect by empowering other factors of production, and substitute other factors of production to create a zero-price effect. From the perspective of data value rights determination, based on the theory of property, the tragedy of the private outweighs the comedy of the private with respect to data, and based on the theory of sharing economy, the comedy of the commons outweighs the tragedy of the commons with respect to data. From the perspective of data pricing, standardized data products can be priced according to the physical product attributes, and non-standardized data products can be priced according to the virtual product attributes. Based on the epochal characteristics of data and theoretical innovation, the “production-exchange” paradigm has undergone a transformation from “using tangible factors to produce tangible products and exchanging tangible products for tangible products” to “using intangible factors to produce tangible products and exchanging intangible products for tangible products” and ultimately to “using intangible factors to produce intangible products and exchanging intangible products for intangible products”.
Keywords: Data theory of value; Data value generation; Data value rights determination; Data value pricing
A Review of the Status and Development Strategies of Computer Science and Technology Under the Background of Big Data
7
Authors: Junlin Zhang. Journal: Journal of Electronic Research and Application, 2024, Issue 2, pp. 49-53, 5 pages.
This article discusses the current status and development strategies of computer science and technology in the context of big data. Firstly, it explains the relationship between big data and computer science and technology, focusing on analyzing the current application status of computer science and technology in big data, including data storage, data processing, and data analysis. Then, it proposes development strategies for big data processing. Computer science and technology play a vital role in big data processing by providing strong technical support.
Keywords: Big data; Computer science and technology; Data storage; Data processing; Data visualization
Data complexity-based batch sanitization method against poison in distributed learning
8
Authors: Silv Wang, Kai Fan, Kuan Zhang, Hui Li, Yintang Yang. Journal: Digital Communications and Networks (SCIE, CSCD), 2024, Issue 2, pp. 416-428, 13 pages.
The security of Federated Learning (FL)/Distributed Machine Learning (DML) is gravely threatened by data poisoning attacks, which destroy the usability of the model by contaminating training samples, so such attacks are called causative availability indiscriminate attacks. Facing the problem that existing data sanitization methods are hard to apply to real-time applications due to their tedious process and heavy computations, we propose a new supervised batch detection method for poison, which can swiftly sanitize the training dataset before the local model training. We design a training dataset generation method that helps to enhance accuracy and uses data complexity features to train a detection model, which will be used in an efficient batch hierarchical detection process. Our model stockpiles knowledge about poison, which can be expanded by retraining to adapt to new attacks. Being neither attack-specific nor scenario-specific, our method is applicable to FL/DML or other online or offline scenarios.
Keywords: Distributed machine learning security; Federated learning; Data poisoning attacks; Data sanitization; Batch detection; Data complexity
Machine Learning Security Defense Algorithms Based on Metadata Correlation Features
9
Authors: Ruchun Jia, Jianwei Zhang, Yi Lin. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 2391-2418, 28 pages.
With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data presents the characteristics of large amounts of data, high diversity, and high real-time requirements. Traditional security defense methods and tools have been unable to cope with the complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features. It emphasizes control over unauthorized users through privacy, integrity, and availability. The user model is established and the mapping between the user model and the metadata of the data source is generated. By analyzing the user model and its corresponding mapping relationship, the query of the user model can be decomposed into the query of various heterogeneous data sources, and the integration of heterogeneous data sources based on the metadata association characteristics can be realized. The method defines and classifies customer information, automatically identifies and perceives sensitive data, builds a behavior audit and analysis platform, analyzes user behavior trajectories, and completes the construction of a machine learning customer information security defense system. The experimental results show that when the data volume is 5×10³ bit, the data storage integrity of the proposed method is 92%. The data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy is always at a high level, and the data disaster recovery performance is good. This method can effectively resist data intrusion and has high air traffic control security. It can not only detect all viruses in user data storage, but also realize integrated virus processing, and further optimize the security defense effect of user big data.
Keywords: Data-oriented architecture; Metadata; Correlation features; Machine learning; Security defense; Data source integration
Analysis of Secured Cloud Data Storage Model for Information
10
Authors: Emmanuel Nwabueze Ekwonwune, Udo Chukwuebuka Chigozie, Duroha Austin Ekekwe, Georgina Chekwube Nwankwo. Journal: Journal of Software Engineering and Applications, 2024, Issue 5, pp. 297-320, 24 pages.
This paper was motivated by the existing problems of cloud data storage at Imo State University, Nigeria, such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hackers, thereby making customer/client data visible and unprotected. This also exposed clients/customers to enormous risk due to defective equipment, bugs, faulty servers, and specious actions. The aim of this paper therefore is to analyze a secure model using Unicode Transformation Format (UTF) base 64 algorithms for storing data in the cloud securely. The Object-Oriented Hypermedia Analysis and Design Methodology (OOHADM) was adopted. Python was used to develop the security model; role-based access control (RBAC) and multi-factor authentication (MFA) algorithms were integrated to enhance security in the information system, which was developed with HTML5, JavaScript, Cascading Style Sheets (CSS) version 3, and PHP 7. This paper also discusses the following concepts: development of computing in the cloud, characteristics of computing, cloud deployment models, cloud service models, etc. The results showed that the proposed enhanced security model for information systems of cooperate platform handled multiple authorization and authentication menace, in that only one login page directs all login requests of the different modules to one Single Sign On Server (SSOS). This in turn redirects users to their requested resources/modules when authenticated, leveraging geo-location integration for physical location validation. The newly developed system will solve the shortcomings of the existing systems and reduce the time and resources incurred while using the existing system.
Keywords: Cloud; Data; Information Model; Data Storage; Cloud Computing; Security System; Data Encryption
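The abstract above refers to "UTF base 64 algorithms" for storing data but does not describe the model itself. The sketch below shows only the generic step that phrase suggests, encoding text as UTF-8 bytes and then as Base64 text, using Python's standard `base64` module; the function names are hypothetical. Base64 on its own is a reversible encoding, so any confidentiality in the paper's model would have to come from the encryption and access-control layers the abstract and keywords mention.

```python
import base64


def encode_for_storage(text: str) -> str:
    """Encode text as UTF-8 bytes, then as a Base64 ASCII string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_from_storage(stored: str) -> str:
    """Reverse encode_for_storage: Base64 text back to the original string."""
    return base64.b64decode(stored.encode("ascii")).decode("utf-8")
```

Because the output is plain ASCII, the encoded value can be placed in text-only fields or transport formats without worrying about the original character set.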
When cryptography stops data science: strategies for resolving the conflicts between data scientists and cryptographers
11
Authors: Banaeian Far Saeed, Imani Rad Azadeh. Journal: Data Science and Management, 2024, Issue 3, pp. 238-255, 18 pages.
The advent of the digital era and computer-based remote communications has significantly enhanced the applicability of various sciences over the past two decades, notably data science (DS) and cryptography (CG). Data science involves clustering and categorizing unstructured data, while cryptography ensures security and privacy aspects. Despite certain CG laws and requirements mandating fully randomized or pseudonoise outputs from CG primitives and schemes, it appears that CG policies might impede data scientists from working on ciphers or analyzing information systems supporting security and privacy services. However, this study posits that CG does not entirely preclude data scientists from operating in the presence of ciphers, as there are several examples of successful collaborations, including homomorphic encryption schemes, searchable encryption algorithms, secret-sharing protocols, and protocols offering conditional privacy. These instances, along with others, indicate numerous potential solutions for fostering collaboration between DS and CG. Therefore, this study classifies the challenges faced by DS and CG into three distinct groups: challenging problems (which can be conditionally solved with tools that are currently available, e.g., secret-sharing protocols, zero-knowledge proofs, partial homomorphic encryption algorithms, etc.), open problems (for which proofs of solvability exist but which remain unsolved, e.g., proposing an efficient functional encryption algorithm, a fully homomorphic encryption scheme, etc.), and hard problems (infeasible to solve with current knowledge and tools). Ultimately, the paper addresses specific solutions and outlines future directions to tackle the challenges arising at the intersection of DS and CG, such as providing specific access for DS experts in secret-sharing algorithms, assigning data index dimensions to DS experts in ultra-dimension encryption algorithms, defining some functional keys in functional encryption schemes for DS experts, and giving limited shares of data to them for analytics.
Keywords: Big data; Data mining; Homomorphic calculation; Randomized data analytic; Searchable encryption
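Secret sharing is one of the collaboration mechanisms the abstract above lists, and the idea of giving analysts "limited shares of data" can be illustrated with a toy additive scheme. This is a minimal sketch under stated assumptions, not the protocol from the paper: the modulus `PRIME`, the function names, and the choice of additive (rather than threshold) sharing are all illustrative.

```python
import secrets

# Assumed modulus for the sketch: the Mersenne prime 2**61 - 1.
PRIME = 2**61 - 1


def split_secret(value, n_shares):
    """Split an integer into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recover the secret; requires all shares."""
    return sum(shares) % PRIME


def add_shared(shares_a, shares_b):
    """Add two shared values share by share, without reconstructing either."""
    return [(a + b) % PRIME for a, b in zip(shares_a, shares_b)]
```

Each individual share is a uniformly random-looking number, yet parties holding one share of each value can compute shares of the sum locally, so an analyst can obtain an aggregate without seeing any single input.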
Designing and Implementing an Advanced Big Data Governance Platform
12
Authors: Yekun Chen, Tianqi Xu, Yongjiang Xue. Journal: Journal of Electronic Research and Application, 2024, Issue 3, pp. 13-19, 7 pages.
Contemporary mainstream big data governance platforms are built atop the big data ecosystem components, offering a one-stop development and analysis governance platform for the collection, transmission, storage, cleansing, transformation, querying and analysis, data development, publishing and subscription, sharing and exchange, management, and services of massive data. These platforms serve various role members who have internal and external data needs. However, in the era of big data, the rapid update and iteration of big data technologies, the diversification of data businesses, and the exponential growth of data present more challenges and uncertainties to the construction of big data governance platforms. This paper discusses how to effectively build a data governance platform under the big data system from the perspectives of functional architecture, logical architecture, data architecture, and functional design.
Keywords: Big data; Data governance; Cleansing and transformation; Data development; Sharing and exchange
Using Python to Analyze Financial Big Data
13
Authors: Xuanrui Zhu. Journal: Journal of Electronic Research and Application, 2024, Issue 5, pp. 12-20, 9 pages.
As technology and the internet develop, more data are generated every day. These data are large in size, high in dimension, and complex in structure. The combination of these three features is “Big Data” [1]. Big data is revolutionizing all industries, bringing colossal impacts to them [2]. Many researchers have pointed out the huge impact that big data can have on our daily lives [3]. We can use the information we obtain to help us make decisions. Also, the conclusions we draw from the big data we analyze can be used as a prediction for the future, helping us to make more accurate and benign decisions earlier than others. If we apply these techniques in finance, for example, in stocks, we can get detailed information about stocks. Moreover, we can use the analyzed data to predict certain stocks. This can help people decide whether to buy a stock or not by providing predicted data at a certain convincing level, helping to protect them from potential losses.
Keywords: Big data finance; Big data in financial services; Big data in risk management; AI; Machine learning
Privacy Protection for Big Data Linking using the Identity Correlation Approach (cited by: 1)
14
Authors: Kevin McCormack, Mary Smyth. Journal: Journal of Statistical Science and Application, 2017, Issue 3, pp. 81-90, 10 pages.
Privacy protection for big data linking is discussed here in relation to the Central Statistics Office (CSO), Ireland's, big data linking project titled the 'Structure of Earnings Survey - Administrative Data Project' (SESADP). The result of the project was the creation of datasets and statistical outputs for the years 2011 to 2014 to meet Eurostat's annual earnings statistics requirements and the Structure of Earnings Survey (SES) Regulation. Record linking across the Census and various public sector datasets enabled the necessary information to be acquired to meet the Eurostat earnings requirements. However, the risk of statistical disclosure (i.e. identifying an individual on the dataset) is high unless privacy and confidentiality safeguards are built into the data matching process. This paper looks at the three methods of linking records on big datasets employed on the SESADP, and how to anonymise the data to protect the identity of the individuals, where potentially disclosive variables exist.
Keywords: Big Data Linking; Data Matching; Data Privacy; Data Confidentiality; Identity Correlation Approach; Data Disclosure; Data Mining
Hadoop-based secure storage solution for big data in cloud computing environment (cited by: 1)
15
Authors: Shaopeng Guan, Conghui Zhang, Yilin Wang, Wenqing Liu. Journal: Digital Communications and Networks (SCIE, CSCD), 2024, Issue 1, pp. 227-236, 10 pages.
In order to address the problems of the single encryption algorithm, such as low encryption efficiency and unreliable metadata for static data storage of big data platforms in the cloud computing environment, we propose a Hadoop-based big data secure storage scheme. Firstly, in order to disperse the NameNode service from a single server to multiple servers, we combine HDFS federation and HDFS high-availability mechanisms, and use the Zookeeper distributed coordination mechanism to coordinate each node to achieve dual-channel storage. Then, we improve the ECC encryption algorithm for the encryption of ordinary data, and adopt a homomorphic encryption algorithm to encrypt data that needs to be calculated. To accelerate the encryption, we adopt the dual-thread encryption mode. Finally, the HDFS control module is designed to combine the encryption algorithm with the storage model. Experimental results show that the proposed solution solves the problem of a single point of failure of metadata, performs well in terms of metadata reliability, and can realize the fault tolerance of the server. The improved encryption algorithm integrates the dual-channel storage mode, and the encryption storage efficiency improves by 27.6% on average.
Keywords: Big data security; Data encryption; Hadoop; Parallel encrypted storage; Zookeeper
Defect Detection Model Using Time Series Data Augmentation and Transformation (cited by: 1)
16
Authors: Gyu-Il Kim, Hyun Yoo, Han-Jin Cho, Kyungyong Chung. Journal: Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 1713-1730, 18 pages.
Time-series data provide important information in many fields, and their processing and analysis have been the focus of much research. However, detecting anomalies is very difficult due to data imbalance, temporal dependence, and noise. Therefore, methodologies for data augmentation and conversion of time series data into images for analysis have been studied. This paper proposes a fault detection model that uses time series data augmentation and transformation to address the problems of data imbalance, temporal dependence, and robustness to noise. The method of data augmentation is set as the addition of noise. It involves adding Gaussian noise, with the noise level set to 0.002, to maximize the generalization performance of the model. In addition, we use the Markov Transition Field (MTF) method to effectively visualize the dynamic transitions of the data while converting the time series data into images. It enables the identification of patterns in time series data and assists in capturing the sequential dependencies of the data. For anomaly detection, the PatchCore model is applied to show excellent performance, and the detected anomaly areas are represented as heat maps. It allows for the detection of anomalies, and by applying an anomaly map to the original image, it is possible to capture the areas where anomalies occur. The performance evaluation shows that both F1-score and Accuracy are high when time series data is converted to images. Additionally, when processed as images rather than as time series data, there was a significant reduction in both the size of the data and the training time. The proposed method can provide an important springboard for research in the field of anomaly detection using time series data. Besides, it helps solve problems such as analyzing complex patterns in data in a lightweight way.
Keywords: Defect detection; Time series; Deep learning; Data augmentation; Data transformation
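The two preprocessing steps described in the abstract above, Gaussian-noise augmentation and the Markov Transition Field (MTF), can be sketched in plain Python. This is a minimal illustration under assumptions, not the authors' pipeline: the noise level 0.002 is treated as the standard deviation, bins are assigned by rank (so tied values may land in different bins), and the function names and the default of 4 bins are hypothetical.

```python
import random


def add_gaussian_noise(series, sigma=0.002, seed=None):
    """Return a copy of the series with zero-mean Gaussian noise added."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in series]


def quantile_bins(series, n_bins):
    """Assign each point a bin index 0..n_bins-1 by its rank in the series."""
    order = sorted(range(len(series)), key=lambda i: series[i])
    bins = [0] * len(series)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(series)
    return bins


def markov_transition_field(series, n_bins=4):
    """Build the MTF image: entry (i, j) is P(bin of x_i -> bin of x_j)."""
    bins = quantile_bins(series, n_bins)
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):      # consecutive time steps
        counts[a][b] += 1
    trans = []
    for row in counts:                    # row-normalize to probabilities
        total = sum(row)
        trans.append([c / total if total else 0.0 for c in row])
    return [[trans[bi][bj] for bj in bins] for bi in bins]
```

The result is a square matrix with one row and column per time step, which is what allows a series to be handed to an image-based anomaly detector such as the PatchCore model the abstract mentions.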
Search Processes in the Exploration of Complex Data under Different Display Conditions
17
Authors: Charles Tatum, David Dickason. Journal: Journal of Data Analysis and Information Processing, 2021, Issue 2, pp. 51-62, 12 pages.
The study investigated user experience, display complexity, display type (tables versus graphs), and task difficulty as variables affecting the user’s ability to navigate through complex visual data. A total of 64 participants, 39 undergraduate students (novice users) and 25 graduate students (intermediate-level users), participated in the study. The experimental design was a 2 × 2 × 2 × 3 mixed design using two between-subject variables (display complexity, user experience) and two within-subject variables (display format, question difficulty). The results indicated that response time was superior for graphs (relative to tables), especially when the questions were difficult. The intermediate users seemed to adopt more extensive search strategies than novices, as revealed by an analysis of the number of changes they made to the display prior to answering questions. It was concluded that designers of data displays should consider the (a) type of display, (b) difficulty of the task, and (c) expertise level of the user to obtain optimal levels of performance.
Keywords: Computer Users; Data Displays; Data Visualization; Data Tables; Data Graphs; Visual Search; Data Complexity; Visual Displays; Visual Data
Research of data architecture in digital ocean
18
Authors: Zhang Feng (张峰), Li Sihai (李四海), Shi Suixiang (石绥祥). Journal: Marine Science Bulletin (CAS), 2010, Issue 2, pp. 85-96, 12 pages.
The characteristics of marine data, such as multiple sources, polymorphism, diversity, and large volume, distinguish them from other data. How to store and manage marine data rationally and effectively, so as to provide powerful data support for the marine management information system and the construction of the "Digital Ocean" prototype system, is an urgent problem. Planning the different types of system data, such as marine resource, marine environment, marine economy, and marine management data, and establishing a marine data architecture framework with a uniform standard serve to realize effective management of marine data at all levels, such as national and provincial (municipal) marine data, and to meet the needs of fundamental information-platform construction.
Keywords: Digital ocean; Data architecture; Data warehouse; Data mart; Metadata
Big Metadata, Smart Metadata, and Metadata Capital: Toward Greater Synergy Between Data Science and Metadata (cited by: 6)
19
Authors: Jane Greenberg. Journal: Journal of Data and Information Science (CSCD), 2017, Issue 3, pp. 19-36, 18 pages.
Purpose: The purpose of the paper is to provide a framework for addressing the disconnect between metadata and data science. Data science cannot progress without metadata research. This paper takes steps toward advancing the synergy between metadata and data science, and identifies pathways for developing a more cohesive metadata research agenda in data science. Design/methodology/approach: This paper identifies factors that challenge metadata research in the digital ecosystem, defines metadata and data science, and presents the concepts big metadata, smart metadata, and metadata capital as part of a metadata lingua franca connecting to data science. Findings: The "utilitarian nature" and "historical and traditional views" of metadata are identified as two intersecting factors that have inhibited metadata research. Big metadata, smart metadata, and metadata capital are presented as part of a metadata lingua franca to help frame research in the data science research space. Research limitations: There are additional, intersecting factors to consider that likely inhibit metadata research, and other significant metadata concepts to explore. Practical implications: The immediate contribution of this work is that it may elicit response, critique, revision, or, more significantly, motivate research. The work presented can encourage more researchers to consider the significance of metadata as a research-worthy topic within data science and the larger digital ecosystem. Originality/value: Although metadata research has not kept pace with other data science topics, there is little attention directed to this problem. This is surprising, given that metadata is essential for data science endeavors. This examination synthesizes original and prior scholarship to provide new grounding for metadata research in data science.
Keywords: metadata research; data science; big metadata; smart metadata; metadata capital
Redundant Data Detection and Deletion to Meet Privacy Protection Requirements in Blockchain-Based Edge Computing Environment
20
Authors: Zhang Lejun, Peng Minghui, Su Shen, Wang Weizheng, Jin Zilong, Su Yansen, Chen Huiling, Guo Ran, Sergey Gataullin, China Communications, SCIE, CSCD, 2024, Issue 3, pp. 149-159 (11 pages)
With the rapid development of information technology, IoT devices play a huge role in physiological health data detection. The exponential growth of medical data requires us to reasonably allocate storage space for cloud servers and edge nodes. The storage capacity of edge nodes close to users is limited. We should store hotspot data in edge nodes as much as possible, so as to ensure response timeliness and access hit rate. However, the current scheme cannot guarantee that every sub-message in a complete data item stored by the edge node meets the requirements of hot data. How to complete the detection and deletion of redundant data in edge nodes under the premise of protecting user privacy and data dynamic integrity has become a challenging problem. Our paper proposes a redundant data detection method that meets the privacy protection requirements. By scanning the cipher text, it is determined whether each sub-message of the data in the edge node meets the requirements of the hot data. It has the same effect as zero-knowledge proof, and it will not reveal the privacy of users. In addition, for redundant sub-data that does not meet the requirements of hot data, our paper proposes a redundant data deletion scheme that meets the dynamic integrity of the data. We use Content Extraction Signature (CES) to generate the remaining hot data signature after the redundant data is deleted. The feasibility of the scheme is proved through safety analysis and efficiency analysis.
Keywords: blockchain; data integrity; edge computing; privacy protection; redundant data