The Newcomb-Benford law, which describes the uneven distribution of the frequencies of digits in data sets, is by its nature probabilistic. The main goal of this work was therefore to derive formulas for the permissible deviations of these frequencies (confidence intervals). A previously developed method, which is an alternative to the traditional approach, was used for this purpose. The alternative formula expressing the Newcomb-Benford law is re-derived and shown, in general form, to be numerically equivalent to the original Benford formula. The resulting confidence-interval formulas for Benford’s law are shown to be useful for checking arrays of numerical data. Consequences for numeral systems with different bases are analyzed. The alternative expression for the frequencies of digits at the second decimal place is derived together with the corresponding deviation intervals. In this approach, all the presented results follow from the positionality property of numeral systems such as decimal and binary.
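The abstract above does not reproduce the paper's own deviation formulas. For orientation only, the traditional approach it contrasts with can be sketched as a normal-approximation binomial interval around the Benford frequency of a first digit. The function name `benford_interval`, the sample size `n`, and the default `z = 1.96` (about 95% coverage) are assumptions made for this illustration, not the authors' method.

```python
import math

def benford_interval(d, n, z=1.96):
    """Approximate interval for the observed frequency of first digit d.

    Assumes n independent values whose first digits follow Benford's law
    and uses the normal approximation to the binomial distribution.
    This is the textbook approach, not the alternative method of the paper.
    """
    p = math.log10(1 + 1 / d)                      # expected Benford frequency
    half_width = z * math.sqrt(p * (1 - p) / n)    # z standard errors
    return max(0.0, p - half_width), min(1.0, p + half_width)

if __name__ == "__main__":
    for digit in range(1, 10):
        low, high = benford_interval(digit, n=1000)
        print(f"digit {digit}: [{low:.4f}, {high:.4f}]")
```

An observed first-digit frequency falling outside its interval would, under these assumptions, suggest a closer look at the data set rather than prove manipulation.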
The Zipfian distribution and Benford’s law are derived from a basic probabilistic argument. It is argued that Zipf’s law is suited to calculating the rank probabilities of identical, indistinguishable objects, and that Benford’s distribution is suited to calculating the rank probabilities of distinguishable objects. For example, in the distribution of words in long texts, all the words in a given rank are identical, so the rank distribution is Zipfian. In logarithmic tables, the objects with identical first digits are distinguishable because there are many different digits in the second, third, and later places, so the distribution follows Benford’s law. The Pareto 20-80 rule is shown to be an outcome of Benford’s distribution: when the number of ranks is about 10, the probability of the 20% of ranks with high probability equals the probability of the remaining 80% of ranks with low probability. It is argued that all these distributions, including the central limit theorem, are outcomes of Planck’s law and result from the quantization of energy. This argument may be considered a physical origin of probability.
The experimental values of 2059 β-decay half-lives are systematically analyzed and investigated. We have found that they are in satisfactory agreement with Benford's law, which states that the frequency of occurrence of each figure, 1-9, as the first significant digit in a surprisingly large number of different data sets follows a logarithmic distribution favoring the smaller ones. Benford's logarithmic distribution of β-decay half-lives can be explained in terms of Newcomb's justification of Benford's law and the empirical exponential law of β-decay half-lives. Moreover, we test the calculated values of 6721 β-decay half-lives with the aid of Benford's law. This indicates that Benford's law is useful for theoretical physicists to test their methods for calculating β-decay half-lives.
Tampering with biometric data has attracted a great deal of attention recently. In addition, a particular biometric sample may be used, intentionally or accidentally, in place of another for a particular application. There is therefore a need for a method that detects data tampering and also differentiates biometric samples in cases of intentional or accidental use for a different application. In this paper, fingerprint image tampering is studied. Optically acquired fingerprints, synthetically generated fingerprints and contact-less acquired fingerprints are also studied for separation purposes using the Benford’s law divergence metric. Benford’s law has been shown in the literature to be very effective in detecting tampering of natural images. In this paper, Benford’s law features combined with a support vector machine are proposed for the detection of malicious tampering of JPEG fingerprint images. This method is aimed at protecting against insider attackers and hackers. The proposed method detected tampering effectively, with an Equal Error Rate (EER) of 2.08%. The experimental results also illustrate that optically acquired fingerprints, synthetically generated fingerprints and contact-less acquired fingerprints can be separated effectively by the proposed method.
Benford's law is a logarithmic law for the distribution of leading digits, formulated as P[D = d] = log(1 + 1/d), where d is the leading digit or group of digits. It is named after Frank Albert Benford (1938), who formulated a mathematical model of this probability. Before him, the same observation was made by Simon Newcomb. This law has changed the usual presumption of equal probability of each digit at each position in a number. The main characteristic properties of this law are base, scale, sum, inverse and product invariance. Base invariance means that the logarithmic law is valid for any base. Inverse invariance means that the logarithmic law for leading digits holds for the inverse values in a sample. Product invariance means that if a random variable X follows Benford's law and Y is an arbitrary random variable with continuous density, then XY follows Benford's law too. Sum invariance means that the sums of significands are the same for any leading digit or group of digits. In this text, a method of testing the sum invariance property is proposed.
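The formula P[D = d] = log(1 + 1/d) from the abstract above can be evaluated directly. The following is a minimal sketch that takes the logarithm in base 10 by default; the function name `benford_probability` and the optional `base` parameter are assumptions introduced here for illustration.

```python
import math

def benford_probability(d, base=10):
    """Return P[D = d] = log_base(1 + 1/d) for a leading digit or digit group d."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    return math.log(1 + 1 / d, base)

# Probabilities of the nine possible first digits in base 10.
first_digit = {d: benford_probability(d) for d in range(1, 10)}

# The same formula applied to two-digit leading groups 10..99.
first_two_digits = {d: benford_probability(d) for d in range(10, 100)}

if __name__ == "__main__":
    for d, p in first_digit.items():
        print(f"P[D = {d}] = {p:.4f}")
```

Because the terms telescope, the probabilities sum to 1 over the digits 1 to 9 and also over the groups 10 to 99, which is a quick sanity check on any implementation.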
In the communication field, a source signal undergoes a convolutive distortion between its symbols and the channel impulse response during transmission. This distortion is referred to as Intersymbol Interference (ISI) and can be reduced significantly by applying a blind adaptive deconvolution process (blind adaptive equalizer) to the distorted received symbols. However, since the entire blind deconvolution process is carried out with no training symbols and the channel’s coefficients are unknown to the receiver, no actual indication can be given during the deconvolution process (via the mean square error (MSE) or ISI expression) of whether the blind adaptive equalizer succeeded in removing the heavy ISI from the transmitted symbols. Up to now, the output of a convolution and deconvolution process was mainly investigated from the ISI point of view. In this paper, the output of a convolution and deconvolution process is inspected from the leading digit point of view. Simulation results indicate that for the 4PAM (Pulse Amplitude Modulation) and 16QAM (Quadrature Amplitude Modulation) input cases, the number “1” is the leading digit at the output of a convolution and deconvolution process, respectively, as long as heavy ISI exists. However, this leading digit does not follow Benford’s law exactly; it approximately follows the leading digit (digit 1) of a Gaussian process for independent identically distributed input symbols and a channel with many coefficients.
Funding (β-decay half-lives study): supported by the National Natural Science Foundation of China under Grant Nos. 10675090, 10535010, and 10775068; the National Fund for Fostering Talents of Basic Science under Grant No. J0630316; the 973 State Key Basic Research and Development Program of China under Grant No. 2007CB815004; the CAS Knowledge Innovation Project under Grant No. KJCX2-SW-N02; and the Research Fund of Doctoral Points under Grant No. 20070284016.