Abstract: The study investigated user experience, display complexity, display type (tables versus graphs), and task difficulty as variables affecting the user’s ability to navigate through complex visual data. A total of 64 participants took part: 39 undergraduate students (novice users) and 25 graduate students (intermediate-level users). The experiment used a 2 × 2 × 2 × 3 mixed design with two between-subject variables (display complexity, user experience) and two within-subject variables (display format, question difficulty). The results indicated that response time was better for graphs than for tables, especially when the questions were difficult. The intermediate users seemed to adopt more extensive search strategies than novices, as revealed by an analysis of the number of changes they made to the display before answering questions. It was concluded that designers of data displays should consider (a) the type of display, (b) the difficulty of the task, and (c) the expertise level of the user to obtain optimal levels of performance.
Funding: This research was financially supported by the Natural Science Foundation of China (Nos. 71420107025, 11701023).
Abstract: The increasing richness of data encourages a comprehensive understanding of economic and financial activities, where variables of interest may include not only scalar (point-like) indicators, but also functional (curve-like) and compositional (pie-like) ones. In many research topics, the variables are also collected chronologically across individuals, which falls into the paradigm of longitudinal analysis. The complicated nature of the data, however, increases the difficulty of modeling these variables under the classic longitudinal framework. In this study, we investigate the linear mixed-effects model (LMM) for such complex data. Different types of variables are first consistently represented using the corresponding basis expansions so that the classic LMM can then be applied to them, which generalizes the theoretical framework of LMM to complex data analysis. A number of simulation studies indicate the feasibility and effectiveness of the proposed model. We further illustrate its practical utility in a real data study on the Chinese stock market and show that the proposed method can enhance the performance and interpretability of the regression for complex data with diversified characteristics.
Funding: Supported by the NSFC project (41474046) and the DQJB project (DQJB16B05) of the Institute of Geophysics, CEA.
Abstract: On November 13, 2016, an Mw 7.8 earthquake struck Kaikoura in the South Island of New Zealand. By means of back-projection of array recordings, ASTFs analysis of global seismic recordings, and joint inversion of global seismic data and co-seismic InSAR data, we investigated the complexity of the earthquake source. The results show that the 2016 Mw 7.8 Kaikoura earthquake ruptured unilaterally for about 100 s from south to northeast (~N28°–33°E), producing a rupture area about 160 km long and about 50 km wide and releasing a scalar moment of 1.01 × 10^21 N·m. In particular, the rupture area consisted of two slip asperities: one close to the initial rupture point, with a maximal slip of ~6.9 m, and the other far away in the northeast, with a maximal slip of ~9.3 m. The first asperity slipped for about 65 s, and the second one started 40 s after the first one had initiated. The two slipped simultaneously for about 25 s. Furthermore, the first had nearly pure thrust slip while the second had both thrust and strike slip. Interestingly, the rupture velocity was not constant, and the whole process may be divided into 5 stages in which the velocities were estimated to be 1.4 km/s, 0 km/s, 2.1 km/s, 0 km/s and 1.1 km/s, respectively. The high-frequency sources were distributed nearly along the lower edge of the rupture area, the high-frequency radiation occurred mainly when the asperities began to slip, and it seemed that no high-frequency energy was radiated as the rupture was coming to a stop.
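The timing figures in this abstract can be cross-checked with two lines of arithmetic. The symbols below are introduced here for illustration only, and the mean rupture speed is a derived back-of-envelope value, not a figure reported by the authors (who give stage-wise velocities that include two pauses).

```latex
% Overlap of the two asperities: the first slips for about 65 s,
% the second starts about 40 s after the first.
t_{\text{overlap}} \approx 65\,\text{s} - 40\,\text{s} = 25\,\text{s}

% Mean rupture speed over the whole event (rupture length / duration),
% an illustrative average rather than a reported result.
\bar{v} \approx \frac{160\,\text{km}}{100\,\text{s}} = 1.6\,\text{km/s}
```

The 25 s overlap matches the value stated in the abstract, and the 1.6 km/s average lies within the range of the non-zero stage velocities (1.1 to 2.1 km/s).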
Funding: Supported by the National Hi-tech Research and Development Program of China (863 Program, Grant No. 2015AA042101).
Abstract: Complex engineered systems are often difficult to analyze and design due to the tangled interdependencies among their subsystems and components. Conventional design methods often need exact modeling or accurate structure decomposition, which limits their practical application. The rapid expansion of data makes it indispensable in practical engineering to use data to guide and improve system design. In this paper, a data-driven uncertainty evaluation approach is proposed to support the design of complex engineered systems. The core of the approach is a data-mining-based uncertainty evaluation method that predicts the uncertainty level of a specific system design by analyzing association relations among different system attributes and synthesizing the information entropy of the covered attribute areas, so that a quantitative measure of system uncertainty can be obtained. Monte Carlo simulation is introduced to obtain the uncertainty extrema, and the possible data distributions under different situations are discussed in detail. The uncertainty values can be normalized using the simulation results, and the normalized values can be used to evaluate different system designs. A prototype system is established, and two case studies have been carried out. The case of an inverted pendulum system validates the effectiveness of the proposed method, and the case of an oil sump design shows its practicability when two or more design plans need to be compared. This research can be used to evaluate the uncertainty of complex engineered systems relying completely on data, and is well suited to plan selection and performance analysis in system design.
Abstract: In studies of HIV, interval-censored data occur naturally. HIV infection time is not usually known exactly; it is known only that infection occurred before the survey, within some time interval, or had not occurred at the time of the survey. Infections are often clustered within geographical areas such as enumerator areas (EAs), which induces unobserved frailty. In this paper we consider an approach for estimating parameters when infection time is unknown and assumed to be correlated within an EA; the dependency is modeled through frailties, assuming a normal distribution for the frailties and a Weibull distribution for the baseline hazard. The data were from a household-based population survey that used a multi-stage stratified sample design to randomly select 23,275 interviewed individuals from 10,584 households, of whom 15,851 were further tested for HIV (crude prevalence = 9.1%). A further test conducted among those who tested HIV positive found 181 (12.5%) recently infected. Results show a high degree of heterogeneity in HIV distribution between EAs, translating to a modest correlation of 0.198. Intervention strategies should target geographical areas that contribute disproportionately to the HIV epidemic. Further research needs to identify such hot-spot areas and understand what factors make these areas prone to HIV.
Abstract: Complex survey designs often involve unequal selection probabilities of clusters or units within clusters. When estimating models for complex survey data, scaled weights are incorporated into the likelihood, producing a pseudo likelihood. In a 3-level weighted analysis for a binary outcome, we implemented two methods for scaling the sampling weights in the National Health Survey of Pakistan (NHSP). For NHSP with health care utilization as a binary outcome, we found age, gender, household (HH) goods, urban/rural status, community development index, province and marital status to be significant predictors of health care utilization (p-value < 0.05). The variance of the random intercepts using scaling method 1 is estimated as 0.0961 (standard error 0.0339) at the PSU level and 0.2726 (standard error 0.0995) at the household level. Both estimates are significantly different from zero (p-value < 0.05) and indicate considerable heterogeneity in health care utilization with respect to households and PSUs. The results of the NHSP data analysis showed that all three analyses, weighted (two scaling methods) and unweighted, converged to almost identical results with few exceptions. This may have occurred because of the large number of 3rd and 2nd level clusters and the relatively small ICC. We performed a simulation study to assess the effect of varying prevalence and intra-class correlation coefficients (ICCs) on the bias of fixed effect parameters and variance components of a multilevel pseudo maximum likelihood (weighted) analysis. The simulation results showed that the performance of the scaled weighted estimators is satisfactory for both scaling methods. Incorporating simulation into the analysis of complex multilevel surveys allows the integrity of the results to be tested and is recommended as good practice.
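To give a feel for what a "relatively small ICC" means for the reported variance components, the following minimal sketch converts them to intra-class correlations. It assumes a random-intercept logistic model with the usual latent-response convention, in which the level-1 residual variance is π²/3; the abstract says only that the outcome is binary, so the logit link, the function name `latent_iccs` and the dictionary keys are assumptions made here for illustration.

```python
import math

# Assumption: logit link, so the level-1 residual on the latent scale
# follows a standard logistic distribution with variance pi^2 / 3.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def latent_iccs(var_psu: float, var_household: float) -> dict:
    """Latent-scale ICCs for a 3-level random-intercept logistic model.

    var_psu       -- variance of the PSU-level random intercept
    var_household -- variance of the household-level random intercept
    """
    total = var_psu + var_household + LOGISTIC_RESIDUAL_VAR
    return {
        # correlation between two people in the same PSU, different households
        "same_psu": var_psu / total,
        # correlation between two people in the same household (and hence PSU)
        "same_household": (var_psu + var_household) / total,
    }


# Variance estimates quoted in the abstract (scaling method 1).
iccs = latent_iccs(0.0961, 0.2726)
print({name: round(value, 3) for name, value in iccs.items()})
```

Under these assumptions the PSU-level ICC is roughly 0.03 and the household-level ICC roughly 0.10, which is in line with the abstract's remark that the ICC is relatively small.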
Abstract: In this paper, we analyze the complexity and entropy of several data compression algorithms: LZW, Huffman, fixed-length code (FLC), and Huffman after fixed-length code (HFLC). We test these algorithms on files of different sizes and conclude that LZW performs best at all compression scales we tested, especially on large files, followed by Huffman, HFLC, and FLC, in that order. Data compression remains an important research topic with many applications. We therefore suggest continuing research in this field and trying to combine two techniques to obtain a better one, or using another source mapping (Hamming), such as embedding a linear array into a hypercube, together with other good techniques like Huffman coding.
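The relationship between entropy, Huffman coding and a fixed-length code can be illustrated with a short sketch. This is not the authors' implementation or test setup: the function names and the sample string are hypothetical, the fixed-length code is assumed to use ⌈log2(alphabet size)⌉ bits per symbol, and LZW and HFLC are not covered.

```python
import heapq
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Empirical entropy of the symbol distribution, in bits per symbol."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def huffman_code_lengths(text: str) -> dict:
    """Return {symbol: code length in bits} for a Huffman code built on text."""
    counts = Counter(text)
    if len(counts) == 1:  # a single distinct symbol still needs one bit
        return {next(iter(counts)): 1}
    # Heap entries are (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def average_bits(text: str, lengths: dict) -> float:
    """Average code length per symbol when text is encoded with lengths."""
    counts = Counter(text)
    return sum(counts[s] * lengths[s] for s in counts) / len(text)


def fixed_length_bits(text: str) -> int:
    """Bits per symbol for a fixed-length code over the symbols in text."""
    return max(1, math.ceil(math.log2(len(set(text)))))


sample = "abracadabra"  # hypothetical input, chosen only for illustration
lengths = huffman_code_lengths(sample)
entropy = shannon_entropy(sample)
huffman_avg = average_bits(sample, lengths)
fixed = fixed_length_bits(sample)
print(f"entropy={entropy:.3f} huffman={huffman_avg:.3f} fixed={fixed}")
```

For any input, the Huffman average sits between the entropy and entropy plus one bit, and never exceeds the fixed-length cost, which is consistent with the ranking of Huffman above FLC reported in the abstract.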
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61203049 and 61303020), the Natural Science Foundation of Shanxi Province of China (Grant No. 2013021018-3), and the Doctoral Startup Foundation of Taiyuan University of Science and Technology, China (Grant No. 20112010).
Abstract: We deal with the problem of pinning sampled-data synchronization for a complex network with probabilistic time-varying coupling delay. The sampling period considered here is assumed to be less than a given bound. Without using the Kronecker product, a new synchronization error system is constructed by using the property of the random variable and the input delay approach. Based on Lyapunov theory, a delay-dependent pinning sampled-data synchronization criterion is derived in terms of linear matrix inequalities (LMIs) that can be solved effectively using the MATLAB LMI toolbox. Numerical examples are provided to demonstrate the effectiveness of the proposed scheme.
Abstract: On the basis of software testing tools we developed for programming languages, we first present a new block-based control flow graph model. Using the notion of a block, we extend the traditional program-based software test data adequacy measurement criteria and empirically analyze the subsumption relation between these criteria. Then, we define four block-based test complexity metrics: J-complexity 0, J-complexity 1, J-complexity 1+, and J-complexity 2. Finally, we show the Kiviat diagram that makes software quality visible.
Abstract: Baddeleyite is an important mineral geochronometer. It is valued in U-Pb (ID-TIMS) geochronology more than zircon because of its magmatic origin, while zircon can be metamorphic, hydrothermal or occur as xenocrysts. Detailed mineralogical (BSE, KL, etc.) research of baddeleyite started in the Fennoscandian Shield in the 1990s. The mineral was first extracted from the Paleozoic Kovdor deposit, the second-biggest baddeleyite deposit in the world after Phalaborwa (2.1 Ga), South Africa. The mineral was successfully introduced into the U-Pb systematics. This study provides new U-Pb and LA-ICP-MS data on Archean Ti-Mgt and BIF deposits, Paleoproterozoic layered PGE intrusions with Pt-Pd and Cu-Ni reefs and Paleozoic complex deposits (baddeleyite, apatite, foscorite ores, etc.) in the NE Fennoscandian Shield. Data on concentrations of REE in baddeleyite and the closure temperature of the U-Pb systematics are also provided. It is shown that baddeleyite plays an important role in the geological history of the Earth, in particular, in the break-up of supercontinents.
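Abstract: Objective: To explore the core acupoints and combination patterns of acupuncture treatment for epilepsy using complex network methods. Methods: The CNKI, VIP, Wanfang, Web of Science, EMBASE and Pubmed databases were searched; the literature was screened according to inclusion and exclusion criteria, and a prescription database was built. SPSS Modeler was used to analyze acupoint frequency and associations, and Gephi 0.10.1 was used to build a complex network model in order to explore the core acupoints and acupoint selection patterns in acupuncture prescriptions for epilepsy. Results: A total of 144 eligible articles were included, yielding 199 prescriptions involving 102 acupoints. Baihui was the most frequently used acupoint; the specific acupoints were mainly five-shu points, eight confluence points and back-shu points; and in terms of meridians, the Governor Vessel contributed the most selected acupoints. Association rule analysis showed that the Baihui-Taichong pair had the highest support and confidence. Topological analysis of the complex network indicated that 36 acupoints, including Baihui, Dazhui, Yaoqi and Fenglong, were the core acupoints for acupuncture treatment of epilepsy. Acupoint community analysis identified three major acupoint groups: a group that treats epilepsy along the course of the Governor Vessel, a group combining distal and proximal points on the limbs and head, and a group based on zang-fu and body-fluid pattern differentiation. Conclusion: Acupoint combinations for acupuncture treatment of epilepsy should centre on the Governor Vessel, combined with acupoint selection based on zang-fu and body-fluid pattern differentiation, and should emphasize the combination of distal and proximal points.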
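The support and confidence measures used in the association rule analysis above can be sketched as follows. This is a minimal illustration, not the SPSS Modeler procedure the authors used: the function names are hypothetical and the four toy prescriptions are invented for the example, not taken from the study's 199 prescriptions.

```python
def support(prescriptions, items):
    """Fraction of prescriptions that contain every acupoint in items."""
    items = set(items)
    return sum(items <= set(p) for p in prescriptions) / len(prescriptions)


def confidence(prescriptions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the prescription list."""
    both = support(prescriptions, set(antecedent) | set(consequent))
    ante = support(prescriptions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical toy data: each set is one acupuncture prescription.
toy_prescriptions = [
    {"Baihui", "Taichong", "Dazhui"},
    {"Baihui", "Taichong"},
    {"Baihui", "Fenglong"},
    {"Dazhui", "Yaoqi"},
]

pair_support = support(toy_prescriptions, {"Baihui", "Taichong"})
rule_confidence = confidence(toy_prescriptions, {"Taichong"}, {"Baihui"})
print(pair_support, rule_confidence)
```

In this toy data the pair appears in two of four prescriptions (support 0.5), and every prescription containing Taichong also contains Baihui (confidence 1.0 for the rule Taichong → Baihui). Confidence is directional: the reverse rule Baihui → Taichong has a lower value here because Baihui also appears without Taichong.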