Funding: Supported by the Beijing Advanced Innovation Center for Language Resources (TYZ19005) and the Research Program of the State Language Commission (ZDI135-105, YB135-89).
Abstract: Because parallel data for the grammatical error correction (GEC) task is scarce, models based on the sequence-to-sequence framework cannot be trained well enough to reach higher performance. We propose two data synthesis methods that control both the error rate and the ratio of error types in the synthetic data. The first approach corrupts each word in a monolingual corpus with a fixed probability, using replacement, insertion, and deletion. The second approach trains error generation models and then filters their decoding results. Experiments on different synthetic datasets show that model performance improves most when the error rate is 40% and the error types occur in equal proportions. Finally, we synthesize about 100 million examples and achieve performance comparable to the state of the art, which uses twice as much data as we do.
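The first approach in the abstract above (word-level corruption with a fixed probability) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `corrupt`, the parameters `error_rate`, `weights`, and `vocab`, and the choice to draw replacement and inserted words uniformly from a vocabulary are all assumptions, since the abstract does not specify these details.

```python
import random


def corrupt(tokens, error_rate=0.4, weights=(1, 1, 1), vocab=None, rng=None):
    """Corrupt each token independently with probability `error_rate`.

    `weights` gives the relative frequency of (replace, insert, delete);
    equal weights mean the three error types are equally likely.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = rng or random.Random()
    # Assumption: with no vocabulary given, sample words from the sentence itself.
    vocab = vocab or tokens
    out = []
    for tok in tokens:
        if rng.random() >= error_rate:
            out.append(tok)  # leave this word unchanged
            continue
        op = rng.choices(("replace", "insert", "delete"), weights=weights)[0]
        if op == "replace":
            # May occasionally pick the same word; a stricter version would resample.
            out.append(rng.choice(vocab))
        elif op == "insert":
            out.append(tok)
            out.append(rng.choice(vocab))  # spurious extra word after the original
        # "delete": append nothing, so the word is dropped
    return out


if __name__ == "__main__":
    sentence = "she goes to school every day".split()
    noisy = corrupt(sentence, error_rate=0.4, rng=random.Random(0))
    # The (noisy, sentence) pair would serve as a synthetic (source, target) example.
    print(" ".join(noisy), "->", " ".join(sentence))
```

Setting `error_rate=0.4` with equal `weights` mirrors the configuration the abstract reports as most helpful; how the error rate is measured in the paper itself may differ from this per-word probability.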
Funding: Supported by the National Natural Science Foundation of China (No. 71773072), the China Postdoctoral Science Foundation (No. 2018M630420), and the Soft Science Research Project of Zhejiang Province (China) (No. 2019C25022).
Abstract: Because investors’ account data are rarely available, existing studies know little about why investors’ returns differ in financial markets. Using unique account data, this paper shows that differences in investors’ returns are correlated with their positions in the social network. The conclusions are as follows. (1) The investors’ social network, constructed from the submission times of completed orders, describes the information diffusion process in financial markets. Information diffuses from the center of the network to the edge, and investors’ returns depend on their position in the network. (2) The social network affects investors’ returns through the positive spillover mechanism of their behavior. Wealthy investors sit at the center of the network: the stronger their information sharing, the higher their status in the network and the higher their return. Retail investors, by contrast, sit at the edge of the network and, for a given level of network centrality, even suffer a return penalty for sharing information. (3) The speed of information diffusion in the investors’ social network has an important impact on asset pricing. Stock volatility, return, and liquidity are high in markets where information diffuses at an intermediate speed. This paper puts forward new reasons for differences in investors’ returns from the perspective of investors’ social networks, and holds that big data in the capital market deserve further exploration with social network methods.