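Preface
Tushare is a free, open-source Python package for financial data. It covers the whole path from collecting stock and other financial data, through cleaning and processing, to storage, giving financial analysts fast, tidy, and varied data that is easy to analyze. This greatly reduces the effort spent on data acquisition and lets them concentrate on researching and implementing strategies and models. Given the strengths of the pandas package in quantitative financial analysis, most of the data Tushare returns is in pandas DataFrame form, which is convenient for analysis and visualization with pandas/NumPy/Matplotlib. If you are used to analyzing data in Excel or a relational database, you can also use Tushare's data storage features to save everything locally first. At users' request, starting from version 0.2.5 Tushare supports both Python 2.x and Python 3.x; parts of the code were refactored and some algorithms optimized to keep data retrieval efficient and stable.
Since its release, Tushare has eased the data-analysis workload of many users and has received a great deal of feedback from them. Tushare will continue to be shared as free and open-source software, in the hope that it helps those who need it.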
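1. Intended audience
[Note] Some people have recently said that Tushare is inconvenient for watching the market. Tushare is not software for ordinary retail traders; it is a tool that supplies pandas tabular data to people interested in analyzing stock and futures data. Whether it can be used for trading, and how well, depends on the individual.
2. Prerequisites
- Install Python
- Install pandas
- lxml is also required. It normally does not need to be installed separately once Anaconda is installed; if it is missing, run: pip install lxml
- Installing Anaconda (http://www.continuum.io/downloads) is recommended: a single install includes the Python environment and all dependencies, which reduces the chance of problems.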
3. Download and installation
4. Version upgrade
>>> import tushare
>>> print(tushare.__version__)
1.2.85
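I. Trading data
Trading data covers stock market quotes. A simple interface call returns the data as a DataFrame. The main categories are:
1. Historical quotes
The Pro API also adds a general-purpose quotes interface that makes it easy to get data for various asset types at various frequencies; you are welcome to use it.
This interface fetches historical trading data for a single stock (including moving-average data). Parameters select daily, weekly, or monthly k-lines, as well as 5-, 15-, 30-, and 60-minute k-lines. It only returns about the most recent 3 years of daily data and is suited to stock screening and analysis together with the moving averages; if you need the full history, call the next interface:
code: stock code, i.e. the 6-digit code, or an index code (sh = SSE Composite Index, sz = SZSE Component Index, hs300 = CSI 300 Index, sz50 = SSE 50, zxb = SME Board, cyb = ChiNext)
ktype: data type, D = daily k-line, W = weekly, M = monthly, 5 = 5-minute, 15 = 15-minute, 30 = 30-minute, 60 = 60-minute; defaults to D
open: opening price
close: closing price
low: lowest price
ma5: 5-day average price
ma10: 10-day average price
ma20: 20-day average price
v_ma5: 5-day average volume
v_ma10: 10-day average volume
v_ma20: 20-day average volume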
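To make the moving-average fields concrete, here is a minimal sketch that rebuilds columns like ma5 and v_ma5 from synthetic closing prices and volumes. It assumes these fields are plain rolling means over the available history (the sample output below, where the earliest rows already carry values, suggests something like min_periods=1); how Tushare actually computes them is not confirmed here, and every number in the frame is made up.

```python
import pandas as pd

# Synthetic closing prices and volumes (made-up numbers, oldest row first).
dates = pd.date_range("2022-01-03", periods=8, freq="D")
quotes = pd.DataFrame(
    {
        "close": [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.8, 11.0],
        "volume": [100.0, 120.0, 90.0, 150.0, 130.0, 110.0, 160.0, 140.0],
    },
    index=dates,
)

# Assumption: ma5 / v_ma5 are simple rolling means of close / volume.
# min_periods=1 lets the earliest rows average whatever history exists.
quotes["ma5"] = quotes["close"].rolling(window=5, min_periods=1).mean()
quotes["v_ma5"] = quotes["volume"].rolling(window=5, min_periods=1).mean()

print(quotes)
```

Note that get_hist_data returns rows newest first, so real data would need sort_index() before applying a rolling window like this.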
>>> import tushare as ts
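>>> # Each call prints the library's deprecation notice in Chinese: "this interface will stop being updated soon; please switch to the Pro API".
>>> p1=ts.get_hist_data('600848') # fetch all available daily k-line data in one call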
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
>>> p1
open high close low ... v_ma5 v_ma10 v_ma20 turnover
date ...
2022-11-10 11.89 12.00 11.96 11.83 ... 26698.54 29148.51 30058.40 0.09
2022-11-09 12.04 12.15 11.95 11.93 ... 26244.56 31886.21 30317.23 0.12
2022-11-08 11.95 12.08 12.01 11.93 ... 26275.42 34077.42 30761.28 0.08
2022-11-07 11.95 12.04 11.98 11.84 ... 31547.13 35641.61 31334.16 0.13
2022-11-04 11.80 12.02 11.96 11.77 ... 29920.10 35862.92 31142.60 0.13
... ... ... ... ... ... ... ... ... ...
2020-05-18 20.50 20.88 20.53 20.42 ... 50837.18 50837.18 50837.18 0.60
2020-05-15 20.26 20.78 20.61 20.22 ... 48396.27 48396.27 48396.27 0.82
2020-05-14 20.10 20.45 20.18 20.05 ... 36782.36 36782.36 36782.36 0.31
2020-05-13 20.05 20.47 20.29 19.92 ... 39611.55 39611.55 39611.55 0.36
2020-05-12 20.30 20.36 20.06 19.89 ... 42419.52 42419.52 42419.52 0.42
[609 rows x 14 columns]
>>> ts.get_hist_data('600848',start='2022-01-05',end='2022-01-09')
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close low ... v_ma5 v_ma10 v_ma20 turnover
date ...
2022-01-07 15.07 15.20 15.13 15.01 ... 59561.83 68826.06 66761.07 0.39
2022-01-06 15.07 15.20 15.05 14.99 ... 64292.86 70723.62 67634.21 0.36
2022-01-05 15.00 15.23 15.07 14.95 ... 88915.92 71704.66 67229.49 0.55
[3 rows x 14 columns]
>>> ts.get_hist_data('600848',ktype='W') # weekly k-line data
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma10 v_ma20 turnover
date ...
2022-11-10 11.95 12.15 11.96 ... 167918.90 171783.42 0.43
2022-11-04 11.50 12.02 11.96 ... 179407.99 181925.79 0.63
2022-10-28 12.01 12.13 11.52 ... 179379.23 199834.01 0.89
2022-10-21 12.09 12.28 11.94 ... 180142.60 202218.20 0.47
2022-10-14 11.83 12.13 12.09 ... 186334.12 205772.96 0.66
... ... ... ... ... ... ... ...
1994-04-22 10.91 11.23 9.68 ... 168624.60 168624.60 55.39
1994-04-15 12.20 12.60 10.90 ... 186410.50 186410.50 34.86
1994-04-08 14.58 14.60 12.15 ... 228098.00 228098.00 18.78
1994-04-01 14.99 16.75 14.20 ... 325617.00 325617.00 108.66
1994-03-25 20.50 22.48 15.40 ... 459992.00 459992.00 261.36
[1354 rows x 14 columns]
>>> ts.get_hist_data('600848',ktype='M') # monthly k-line data
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma10 v_ma20 turnover
date ...
2022-11-10 11.70 12.15 11.96 ... 833804.25 1149169.85 0.97
2022-10-31 11.83 12.28 11.48 ... 934957.98 1164152.89 2.11
2022-09-30 12.58 13.06 11.82 ... 1024421.91 1161109.39 3.42
2022-08-31 12.20 13.10 12.65 ... 1006450.81 1151565.36 3.83
2022-07-29 13.59 13.68 12.19 ... 971014.76 1137911.91 2.91
... ... ... ... ... ... ... ...
1994-07-29 6.88 6.90 4.20 ... 229020.00 229020.00 45.21
1994-06-30 8.40 8.40 6.85 ... 266384.25 266384.25 43.60
1994-05-31 9.65 10.20 8.40 ... 329597.67 329597.67 42.88
1994-04-29 13.90 14.95 9.65 ... 456663.50 456663.50 168.97
1994-03-31 20.50 22.48 13.90 ... 615940.00 615940.00 349.97
[328 rows x 14 columns]
>>> ts.get_hist_data('600848',ktype='5') # 5-minute k-line data
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma10 v_ma20 turnover
date ...
2022-11-10 15:00:00 11.98 12.00 11.99 ... 408.36 333.36 0.01
2022-11-10 14:55:00 11.98 11.98 11.98 ... 410.56 321.41 0.01
2022-11-10 14:50:00 11.97 11.98 11.98 ... 369.22 314.01 0.00
2022-11-10 14:45:00 11.96 11.98 11.96 ... 350.62 320.11 0.01
2022-11-10 14:40:00 11.94 11.98 11.96 ... 315.52 330.76 0.00
... ... ... ... ... ... ... ...
2022-11-01 14:15:00 11.85 11.88 11.84 ... 628.32 796.86 0.01
2022-11-01 14:10:00 11.82 11.85 11.85 ... 568.72 757.16 0.02
2022-11-01 14:05:00 11.80 11.82 11.82 ... 689.50 683.80 0.00
2022-11-01 14:00:00 11.81 11.84 11.80 ... 800.50 684.60 0.01
2022-11-01 13:55:00 11.78 11.80 11.80 ... 946.60 704.50 0.01
[350 rows x 14 columns]
>>> ts.get_hist_data('600848',ktype='15') # 15-minute k-line data
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma10 v_ma20 turnover
date ...
2022-11-10 15:00:00 11.97 12.00 11.99 ... 1181.38 1533.34 0.02
2022-11-10 14:45:00 11.97 11.98 11.96 ... 1064.62 1478.21 0.01
2022-11-10 14:30:00 11.99 11.99 11.97 ... 1006.74 1476.03 0.02
2022-11-10 14:15:00 11.95 12.00 11.99 ... 1102.07 1474.00 0.01
2022-11-10 14:00:00 11.90 11.96 11.95 ... 1269.45 1478.69 0.01
... ... ... ... ... ... ... ...
2022-10-12 11:15:00 11.47 11.53 11.48 ... 1671.77 1536.36 0.02
2022-10-12 11:00:00 11.52 11.57 11.48 ... 1561.94 1622.97 0.02
2022-10-12 10:45:00 11.54 11.57 11.52 ... 1546.47 1673.89 0.01
2022-10-12 10:30:00 11.49 11.55 11.51 ... 1552.77 1948.81 0.02
2022-10-12 10:15:00 11.54 11.56 11.48 ... 1434.37 1934.61 0.02
[350 rows x 14 columns]
>>> ts.get_hist_data('600848',ktype='30') # 30-minute k-line data
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma10 v_ma20 turnover
date ...
2022-11-10 15:00:00 11.97 12.00 11.99 ... 3066.68 2857.67 0.03
2022-11-10 14:30:00 11.95 12.00 11.97 ... 2952.07 2762.90 0.03
2022-11-10 14:00:00 11.90 11.96 11.95 ... 2957.39 2709.08 0.01
2022-11-10 13:30:00 11.93 11.93 11.90 ... 3070.15 2764.78 0.04
2022-11-10 11:30:00 11.94 11.95 11.93 ... 2989.99 2979.70 0.04
... ... ... ... ... ... ... ...
2022-09-02 14:30:00 12.61 12.63 12.59 ... 5233.64 5498.00 0.07
2022-09-02 14:00:00 12.62 12.65 12.63 ... 5485.40 5424.79 0.03
2022-09-02 13:30:00 12.64 12.67 12.62 ... 5944.22 5668.56 0.04
2022-09-02 11:30:00 12.65 12.67 12.65 ... 6084.60 5942.25 0.04
2022-09-02 11:00:00 12.66 12.68 12.65 ... 6234.58 6127.09 0.04
[350 rows x 14 columns]
>>> sh1=ts.get_hist_data('sh',ktype='D')
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
>>> sh1["price_change"]
date
2022-11-10 -12.04
2022-11-09 -16.32
2022-11-08 -13.33
2022-11-07 7.02
2022-11-04 72.99
...
2020-05-18 6.96
2020-05-15 -1.88
2020-05-14 -27.71
2020-05-13 6.49
2020-05-12 -3.24
Name: price_change, Length: 609, dtype: float64
>>> ts.get_hist_data('sz')
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high ... v_ma10 v_ma20
date ...
2022-11-10 10964.94 10979.71 ... 4.268287e+08 4.089080e+08
2022-11-09 11149.37 11185.51 ... 4.268461e+08 4.052112e+08
2022-11-08 11202.44 11212.54 ... 4.313180e+08 4.028833e+08
2022-11-07 11184.16 11272.80 ... 4.305600e+08 3.967721e+08
2022-11-04 10848.30 11226.21 ... 4.257688e+08 3.889228e+08
... ... ... ... ... ...
2020-05-18 10969.08 11017.50 ... 3.224774e+08 3.224774e+08
2020-05-15 11013.16 11039.68 ... 3.125770e+08 3.125770e+08
2020-05-14 11026.71 11047.38 ... 3.110545e+08 3.110545e+08
2020-05-13 10978.31 11096.62 ... 3.040613e+08 3.040613e+08
2020-05-12 10972.05 11018.93 ... 3.092858e+08 3.092858e+08
[610 rows x 13 columns]
>>> ts.get_hist_data('hs300')
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma5 v_ma10 v_ma20
date ...
2022-11-10 3685.83 3701.53 3685.69 ... 1143043.20 1230205.08 1144956.52
2022-11-09 3750.78 3760.41 3714.27 ... 1157156.68 1244694.45 1142711.33
2022-11-08 3773.66 3779.12 3749.33 ... 1261473.48 1276561.48 1152038.73
2022-11-07 3754.52 3792.79 3775.30 ... 1362013.98 1288498.91 1144502.81
2022-11-04 3646.77 3782.88 3767.17 ... 1379292.53 1288729.40 1130913.42
... ... ... ... ... ... ... ...
2020-05-18 3914.66 3946.43 3922.91 ... 960782.75 960782.75 960782.75
2020-05-15 3941.13 3945.20 3912.82 ... 918974.03 918974.03 918974.03
2020-05-14 3952.20 3952.20 3925.22 ... 916570.77 916570.77 916570.77
2020-05-13 3946.64 3972.53 3968.25 ... 932206.13 932206.13 932206.13
2020-05-12 3961.34 3970.11 3960.24 ... 982505.56 982505.56 982505.56
[609 rows x 13 columns]
>>> ts.get_hist_data('sz50')
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma5 v_ma10 v_ma20
date ...
2022-11-10 2422.01 2451.47 2440.32 ... 343385.46 366714.09 330489.80
2022-11-09 2460.24 2470.41 2440.35 ... 324282.93 364027.32 325701.40
2022-11-08 2476.65 2482.58 2459.09 ... 358614.06 375482.11 327456.54
2022-11-07 2452.01 2490.32 2477.77 ... 404213.09 376622.16 325915.11
2022-11-04 2383.96 2479.23 2468.28 ... 406370.41 375982.12 322553.31
... ... ... ... ... ... ... ...
2020-05-18 2821.73 2849.85 2837.17 ... 203143.18 203143.18 203143.18
2020-05-15 2845.91 2849.17 2819.69 ... 197361.82 197361.82 197361.82
2020-05-14 2854.78 2854.78 2834.53 ... 198247.15 198247.15 198247.15
2020-05-13 2860.82 2869.70 2867.71 ... 198279.01 198279.01 198279.01
2020-05-12 2871.31 2880.50 2869.60 ... 213301.50 213301.50 213301.50
[609 rows x 13 columns]
>>> ts.get_hist_data('zxb')
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma5 v_ma10 v_ma20
date ...
2022-11-10 7443.46 7447.11 7385.15 ... 27761948.8 29729345.2 27252668.3
2022-11-09 7625.30 7631.52 7521.39 ... 28837266.0 30204169.8 27307972.3
2022-11-08 7642.50 7653.29 7605.50 ... 31841529.2 30875491.2 27441360.4
2022-11-07 7648.39 7703.12 7645.53 ... 33458323.6 30875514.0 27216572.1
2022-11-04 7397.68 7676.36 7648.52 ... 33507745.2 30552967.0 26710824.8
... ... ... ... ... ... ... ...
2020-05-18 7149.74 7170.26 7097.87 ... 30687283.2 30687283.2 30687283.2
2020-05-15 7195.64 7227.61 7164.75 ... 29300007.0 29300007.0 29300007.0
2020-05-14 7218.82 7231.74 7163.63 ... 29257854.0 29257854.0 29257854.0
2020-05-13 7178.54 7275.78 7259.07 ... 29198595.0 29198595.0 29198595.0
2020-05-12 7174.98 7204.59 7202.65 ... 29226862.0 29226862.0 29226862.0
[610 rows x 13 columns]
>>> ts.get_hist_data('cyb')
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high close ... v_ma5 v_ma10 v_ma20
date ...
2022-11-10 2380.23 2393.61 2357.13 ... 15429622.60 16807431.50 15938125.75
2022-11-09 2422.17 2432.29 2399.34 ... 15424331.00 17304259.30 15754517.40
2022-11-08 2452.86 2460.99 2432.40 ... 17622242.00 18123539.40 15812627.75
2022-11-07 2447.13 2470.64 2454.69 ... 18872023.20 18551280.60 15616770.90
2022-11-04 2378.42 2464.51 2451.22 ... 18837383.20 18385820.90 15325575.65
... ... ... ... ... ... ... ...
2020-05-18 2122.31 2137.15 2114.86 ... 21334267.60 21334267.60 21334267.60
2020-05-15 2128.88 2138.97 2124.31 ... 20632134.50 20632134.50 20632134.50
2020-05-14 2131.46 2137.14 2117.65 ... 20503188.67 20503188.67 20503188.67
2020-05-13 2120.01 2145.02 2140.68 ... 19869736.00 19869736.00 19869736.00
2020-05-12 2106.22 2125.32 2124.15 ... 19653746.00 19653746.00 19653746.00
[610 rows x 13 columns]
>>> ts.get_hist_data('sz',ktype='M') # monthly k-line data for the SZSE Component Index
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
open high ... v_ma10 v_ma20
date ...
2022-11-10 10420.95 11272.80 ... 8.090958e+09 8.462892e+09
2022-10-31 10788.69 11270.48 ... 8.650721e+09 8.701965e+09
2022-09-30 11789.82 11966.17 ... 9.123413e+09 8.666101e+09
2022-08-31 12243.33 12611.49 ... 9.344029e+09 8.689353e+09
2022-07-29 12899.47 13121.39 ... 8.862613e+09 8.538497e+09
... ... ... ... ... ...
1991-08-31 574.82 580.59 ... 9.599200e+03 9.599200e+03
1991-07-31 685.65 687.48 ... 7.688750e+03 7.688750e+03
1991-06-29 784.48 784.48 ... 3.866000e+03 3.866000e+03
1991-05-31 876.57 876.57 ... 2.624500e+03 2.624500e+03
1991-04-30 988.05 988.05 ... 1.520000e+02 1.520000e+02
[379 rows x 13 columns]
2. [Case study 1]
(1) Fetch a stock's historical quotes with the tushare package
# Fetch k-line data into a DataFrame; 600519 is the Kweichow Moutai stock
>>> df=ts.get_k_data("600519",start="1999-01-01")
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
Warning (from warnings module):
File "E:\python 3.7\lib\site-packages\tushare\stock\trading.py", line 706
data = data.append(_get_k_data(url, dataflag,
FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
# Save the data fetched from Tushare to a local file
>>> df.to_csv("600519.csv")
...
# Use the date column of the raw data as the row index and parse its strings into datetime objects, giving the frame an explicit index
>>> import pandas as pd
>>> df=pd.read_csv('600519.csv',index_col='date',parse_dates=['date'])[['open','close','high','low']]
>>> df
open close high low
date
2001-08-27 -113.034 -112.849 -112.453 -113.329
2001-08-28 -112.949 -112.616 -112.591 -113.016
2001-08-29 -112.595 -112.702 -112.591 -112.751
2001-08-30 -112.719 -112.574 -112.501 -112.769
2001-08-31 -112.565 -112.590 -112.481 -112.627
... ... ... ... ...
2022-11-04 1437.960 1516.570 1527.770 1437.960
2022-11-07 1494.050 1507.110 1518.480 1486.590
2022-11-08 1508.000 1484.880 1517.990 1468.960
2022-11-09 1483.950 1459.900 1488.880 1450.000
2022-11-10 1444.000 1475.000 1485.000 1435.150
[5071 rows x 4 columns]
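The save-and-reload step above can be tried without network access. The sketch below is only an illustration: it writes a small frame to an in-memory buffer instead of a file, then reads it back with the same index_col and parse_dates arguments. The three price rows are copied from the output above; the volume column is a placeholder added for the example.

```python
import io

import pandas as pd

# A tiny stand-in for the frame returned by the data interface.
raw = pd.DataFrame(
    {
        "date": ["2022-11-08", "2022-11-09", "2022-11-10"],
        "open": [1508.00, 1483.95, 1444.00],
        "close": [1484.88, 1459.90, 1475.00],
        "high": [1517.99, 1488.88, 1485.00],
        "low": [1468.96, 1450.00, 1435.15],
        "volume": [1.0, 2.0, 3.0],  # placeholder values, not real data
    }
)

# to_csv also writes the default integer index as an unnamed first column.
buffer = io.StringIO()
raw.to_csv(buffer)
buffer.seek(0)

# index_col makes 'date' the row index; parse_dates turns its strings into timestamps.
# Selecting the four price columns drops the unnamed column and 'volume'.
loaded = pd.read_csv(buffer, index_col="date", parse_dates=["date"])[
    ["open", "close", "high", "low"]
]
print(loaded)
```

With a datetime index in place, slices such as loaded['2022-11'] or loaded['2010-01':'2021-02'] select rows by date range, which the later steps rely on.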
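(2) Suppose that, starting from January 1, 2010, I buy 1 lot of the stock on the first trading day of every month and sell all holdings on the last trading day of every year. What is my return as of today?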
>>> df = ts.get_k_data("600519",start="1999-01-01")
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
>>> df.to_csv("D600519.csv")
>>> df=pd.read_csv('600519.csv',index_col='date',parse_dates=['date'])[['open','close','high','low']]
>>> price_last = df['open'][-1]
>>> df = df['2010-01':'2021-02'] # trim unused data at both ends
>>> print(df)
open close high low
date
2010-01-04 13.919 12.371 13.919 11.898
2010-01-05 13.160 11.996 13.543 11.665
2010-01-06 11.658 9.982 12.041 9.644
2010-01-07 9.982 7.698 10.305 6.316
2010-01-08 7.909 6.406 7.909 4.978
... ... ... ... ...
2021-02-22 2414.032 2247.052 2414.032 2237.252
2021-02-23 2224.172 2266.032 2303.922 2224.032
2021-02-24 2267.022 2148.032 2277.032 2119.532
2021-02-25 2168.032 2109.032 2183.532 2080.242
2021-02-26 2059.032 2081.812 2138.982 2026.332
[2703 rows x 4 columns]
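>>> # pandas provides resample() as a convenient way to resample time series; moving to a coarser or a finer time granularity is called downsampling or upsampling respectively:
... df_monthly = df.resample("M").first()
>>> #print(df_monthly)
... df_yearly = df.resample("A").last()[:-1] # drop the last year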
>>> print(df_yearly)
open close high low
date
2010-12-31 22.172 23.765 24.035 21.608
2011-12-31 46.568 47.064 48.370 44.337
2012-12-31 66.904 63.359 68.136 61.152
2013-12-31 -1.601 2.019 2.788 -2.840
2014-12-31 68.262 71.917 72.262 67.717
2015-12-31 121.731 121.701 123.011 121.331
2016-12-31 236.292 243.832 244.972 236.292
2017-12-31 634.469 613.959 642.969 608.069
2018-12-31 490.768 517.478 523.868 487.468
2019-12-31 1125.007 1125.007 1130.007 1118.517
2020-12-31 1900.032 1957.032 1958.012 1898.032
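The two downsampling steps, first row of each month and last row of each year, can be illustrated on synthetic data. The sketch below uses groupby on calendar periods rather than resample, because recent pandas releases deprecate the "M" and "A" frequency aliases for resample; the grouping picks the same rows but labels them with periods instead of period-end dates. The price values are arbitrary counters, not market data.

```python
import pandas as pd

# Synthetic business-day series spanning two full years; 'open' is just a counter.
idx = pd.bdate_range("2020-01-01", "2021-12-31")
prices = pd.DataFrame({"open": [float(i) for i in range(len(idx))]}, index=idx)

# First available row of every calendar month (the monthly buy price).
monthly_first = prices.groupby(prices.index.to_period("M")).first()

# Last available row of every calendar year (the yearly sell price).
yearly_last = prices.groupby(prices.index.to_period("Y")).last()

print(monthly_first.head())
print(yearly_last)
```

Here first() and last() pick the first and last trading rows that actually exist in each period, so weekends and holidays need no special handling.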
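>>> cost_money = 0
>>> hold = 0 # shares currently held
>>> # Note: range(2010, 2017) covers 2010 through 2016 only, so the `year != 2019` test below is always true, purchases from 2017 onward are not counted, and hold is 0 when the loop ends.
>>> for year in range(2010, 2017):
...
...     cost_money -= df_monthly.loc[str(year)]['open'].sum()*100
...     hold += len(df_monthly[str(year)]['open']) * 100
...     if year != 2019:
...         cost_money += df_yearly[str(year)]['open'][0] * hold
...         hold = 0 # everything was sold at year end
...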
Warning (from warnings module):
File "<pyshell#72>", line 4
FutureWarning: Indexing a DataFrame with a datetimelike index using a single string to slice the rows, like `frame[string]`, is deprecated and will be removed in a future version. Use `frame.loc[string]` instead.
Warning (from warnings module):
File "<pyshell#72>", line 6
FutureWarning: Indexing a DataFrame with a datetimelike index using a single string to slice the rows, like `frame[string]`, is deprecated and will be removed in a future version. Use `frame.loc[string]` instead.
>>> cost_money += hold * price_last
>>> print(cost_money)
122939.49999999994
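For comparison, the strategy described in the question can be written as a small function that walks every year present in the data and treats the final year as incomplete. This is a sketch under stated assumptions, not the author's code: the function name and the lot parameter are hypothetical, the sell price is the open of each year's last trading day (as in the loop above), and the test prices are made up.

```python
import pandas as pd


def backtest_monthly_buy_yearly_sell(prices, last_price, lot=100):
    """Buy one lot at each month's first open; sell all at each completed year's last open.

    Assumes `prices` has a sorted DatetimeIndex and an 'open' column.
    Shares bought in the final (incomplete) year are valued at `last_price`.
    """
    years = prices.index.year
    final_year = years.max()
    cash = 0.0
    hold = 0
    for year in sorted(set(years)):
        year_rows = prices[years == year]
        # Opening price on the first trading day of each month in this year.
        month_first_open = year_rows.groupby(year_rows.index.month)["open"].first()
        cash -= month_first_open.sum() * lot
        hold += len(month_first_open) * lot
        if year != final_year:
            # Sell everything at the open of the year's last trading day.
            cash += year_rows["open"].iloc[-1] * hold
            hold = 0
    # Value whatever is still held at the most recent price.
    return float(cash + hold * last_price)


# Made-up flat prices: 12 buys in 2020 are sold at cost, 3 buys in 2021 are still held.
idx = pd.bdate_range("2020-01-01", "2021-03-31")
flat = pd.DataFrame({"open": 10.0}, index=idx)
print(backtest_monthly_buy_yearly_sell(flat, last_price=12.0))  # 600.0
```

Deriving the year range from the data avoids a hard-coded range and sentinel year, so the loop cannot silently skip years when the date window changes.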
>>> import matplotlib.pyplot as plt
>>> df['close'].plot()
<AxesSubplot: xlabel='date'>
>>> plt.show()
>>> df.describe()
open close high low
count 2703.000000 2703.000000 2703.000000 2703.000000
mean 368.869292 369.726308 375.221292 363.552602
std 491.748814 492.813804 498.445099 486.208050
min -20.641000 -20.439000 -19.515000 -21.017000
25% 41.771500 41.944500 43.649500 39.952500
50% 111.281000 111.251000 114.911000 107.511000
75% 597.259000 597.373500 605.633000 589.018000
max 2547.012000 2560.032000 2586.912000 2444.032000
>>> df1=ts.get_k_data("002075",start="1999-01-01")
本接口即将停止更新,请尽快使用Pro版接口:https://tushare.pro/document/2
>>> df1.to_csv("002075.csv")
...
>>> import pandas as pd
>>> df1=pd.read_csv("002075.csv",index_col='date',parse_dates=['date'])[['open','close','high','low']]
>>> import matplotlib.pyplot as plt
>>> print(df)
open close high low
date
2010-01-04 13.919 12.371 13.919 11.898
2010-01-05 13.160 11.996 13.543 11.665
2010-01-06 11.658 9.982 12.041 9.644
2010-01-07 9.982 7.698 10.305 6.316
2010-01-08 7.909 6.406 7.909 4.978
... ... ... ... ...
2021-02-22 2414.032 2247.052 2414.032 2237.252
2021-02-23 2224.172 2266.032 2303.922 2224.032
2021-02-24 2267.022 2148.032 2277.032 2119.532
2021-02-25 2168.032 2109.032 2183.532 2080.242
2021-02-26 2059.032 2081.812 2138.982 2026.332
[2703 rows x 4 columns]
>>> df1['close'].plot()
<AxesSubplot: xlabel='date'>
>>> df1.describe()
open close high low
count 2947.000000 2947.000000 2947.000000 2947.000000
mean 6.061386 6.070188 6.226058 5.915490
std 3.803842 3.812494 3.929221 3.692869
min 0.697000 0.711000 0.747000 0.675000
25% 2.933000 2.940000 3.011000 2.861000
50% 4.861000 4.861000 4.997000 4.790000
75% 8.601000 8.625000 8.869500 8.283000
max 23.947000 23.883000 24.625000 22.933000
(3) List every date on which this stock closed more than 3% above its open
>>> condition=(df1['close']-df1['open'])/df1['open']>0.03
>>> df1.loc[condition].index
DatetimeIndex(['2006-10-25', '2006-10-30', '2006-11-01', '2006-11-08',
'2006-11-09', '2006-11-13', '2006-11-27', '2006-12-01',
'2006-12-11', '2007-01-05',
...
'2022-02-07', '2022-02-18', '2022-02-21', '2022-04-06',
'2022-04-29', '2022-05-12', '2022-07-05', '2022-07-18',
'2022-08-05', '2022-10-17'],
dtype='datetime64[ns]', name='date', length=427, freq=None)
>>> df1.loc[condition]
open close high low
date
2006-10-25 2.325 2.583 2.611 2.325
2006-10-30 2.536 2.704 2.711 2.511
2006-11-01 2.665 2.797 2.965 2.650
2006-11-08 2.711 2.868 2.915 2.665
2006-11-09 2.858 2.954 3.058 2.808
... ... ... ... ...
2022-05-12 4.581 4.741 4.881 4.571
2022-07-05 4.820 4.970 5.100 4.810
2022-07-18 4.450 4.590 4.610 4.450
2022-08-05 4.250 4.450 4.650 4.250
2022-10-17 3.950 4.320 4.320 3.930
[427 rows x 4 columns]
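The filtering step above is ordinary boolean masking. A minimal sketch on four made-up bars shows which rows survive the 3% test; the column names match the ones used above, and the numbers are invented so the result is easy to check by hand.

```python
import pandas as pd

# Four made-up daily bars: intraday gains of about 2%, 5%, 2%, and 5%.
bars = pd.DataFrame(
    {"open": [10.0, 10.0, 20.0, 20.0], "close": [10.2, 10.5, 20.4, 21.0]},
    index=pd.to_datetime(["2022-01-04", "2022-01-05", "2022-01-06", "2022-01-07"]),
)

# Relative move from open to close, one value per row.
intraday_gain = (bars["close"] - bars["open"]) / bars["open"]

# The comparison yields a boolean Series; .loc keeps only the True rows.
rose_over_3pct = bars.loc[intraday_gain > 0.03]
matching_dates = list(rose_over_3pct.index.strftime("%Y-%m-%d"))
print(matching_dates)  # ['2022-01-05', '2022-01-07']
```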
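(4) List every date on which this stock opened more than 2% below the previous day's close
>>> # The comparison is against the previous day
>>> # shift(1): the row index stays unchanged and the values move down one row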
>>> condition=(df1['open']-df1['close'].shift(1))/df1['close'].shift(1)<=-0.02
>>> condition
date
2006-10-25 False
2006-10-26 False
2006-10-27 True
2006-10-30 False
2006-10-31 False
...
2022-11-04 False
2022-11-07 False
2022-11-08 False
2022-11-09 False
2022-11-10 False
Length: 2947, dtype: bool
>>> df1[condition].index
DatetimeIndex(['2006-10-27', '2006-12-08', '2007-01-25', '2007-02-05',
'2007-03-19', '2007-05-14', '2007-05-16', '2007-05-21',
'2007-06-04', '2007-06-05',
...
'2021-07-08', '2021-07-09', '2021-07-12', '2021-07-13',
'2021-07-14', '2021-09-22', '2021-11-03', '2022-02-22',
'2022-02-24', '2022-05-06'],
dtype='datetime64[ns]', name='date', length=194, freq=None)
>>> df1[condition]
open close high low
date
2006-10-27 2.550 2.540 2.568 2.461
2006-12-08 3.218 3.040 3.254 3.033
2007-01-25 3.575 3.418 3.668 3.397
2007-02-05 3.075 3.161 3.197 3.075
2007-03-19 3.497 3.618 3.654 3.415
... ... ... ... ...
2021-09-22 7.161 7.301 7.351 7.151
2021-11-03 5.331 5.581 5.771 5.331
2022-02-22 6.921 6.701 7.061 6.691
2022-02-24 6.381 6.161 6.521 6.071
2022-05-06 4.621 4.581 4.661 4.561
[194 rows x 4 columns]
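The role of shift(1) is easiest to see on a few rows. In this sketch, with made-up prices, each open is compared with the close of the row before it; the first row has no previous close, so its comparison is NaN and it is never selected.

```python
import pandas as pd

# Made-up bars: the second row opens 3% below the prior close, the fourth 10% below.
bars = pd.DataFrame(
    {"open": [10.0, 9.7, 10.0, 9.0], "close": [10.0, 10.0, 10.0, 9.5]},
    index=pd.to_datetime(["2022-01-04", "2022-01-05", "2022-01-06", "2022-01-07"]),
)

# shift(1) moves values down one row, so each row sees the previous row's close.
prev_close = bars["close"].shift(1)

# Overnight gap relative to the previous close; NaN for the first row.
gap = (bars["open"] - prev_close) / prev_close

# Comparisons against NaN are False, so the first row drops out automatically.
gapped_down = bars[gap <= -0.02]
gap_dates = list(gapped_down.index.strftime("%Y-%m-%d"))
print(gap_dates)  # ['2022-01-05', '2022-01-07']
```

This only works as intended when the rows are sorted oldest first, which holds for the frame loaded from CSV above.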
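(5) Suppose that, starting from January 1, 2010, I buy one lot of the stock on the first trading day of every month and sell all holdings on the last trading day of every year. What is my return as of today?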
>>> df1=pd.read_csv("002075.csv",index_col='date',parse_dates=['date'])[['open','close','high','low']]
>>> price_last=df1['open'][-1] # most recent opening price
>>> df1=df1['2010-01':'2021-02'] # trim unused data at both ends
>>> #print(df1)
>>> # pandas provides resample() as a convenient way to resample time series; moving to a coarser or a finer time granularity is called downsampling or upsampling respectively
>>> df_monthly=df1.resample("M").first()
>>> #print(df_monthly)
>>> df_yearly=df1.resample("A").last()[:-1] # drop the last year
>>> #print(df_yearly)
>>> cost_money=0
>>> hold=0 # shares currently held
>>> for year in range(2010,2017):
...     cost_money -= df_monthly.loc[str(year)]['open'].sum()*100
...     hold += len(df_monthly[str(year)]['open'])*100
...     if year!=2019:
...         cost_money += df_yearly[str(year)]['open'][0]*hold
...         hold=0
...
...
Warning (from warnings module):
File "<pyshell#146>", line 3
FutureWarning: Indexing a DataFrame with a datetimelike index using a single string to slice the rows, like `frame[string]`, is deprecated and will be removed in a future version. Use `frame.loc[string]` instead.
Warning (from warnings module):
File "<pyshell#146>", line 5
FutureWarning: Indexing a DataFrame with a datetimelike index using a single string to slice the rows, like `frame[string]`, is deprecated and will be removed in a future version. Use `frame.loc[string]` instead.
>>> cost_money += hold*price_last
>>> print(cost_money)
31657.6
Original article: https://blog.csdn.net/m0_72318954/article/details/127792114