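[Data Competition] The Most Stable Gold-Medal Solution in the Japan Stock Trading Competition (JPX Tokyo Stock Exchange Prediction), with Code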
Author: flaty
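Introduction
The fourth-place competitor stayed near the top of the leaderboard from the first round to the last, which makes this one of the most stable solutions in the competition. Let's study its open-source code to see what lies behind that stability.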
Solution
01
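Data preprocessing
Adjust the closing price with the cumulative AdjustmentFactor, so that prices stay comparable across stock splits and reverse splits.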
from decimal import Decimal, ROUND_HALF_UP

import numpy as np
import pandas as pd


def adjust_price(price):
    """
    Args:
        price (pd.DataFrame) : pd.DataFrame include stock_price
    Returns:
        price DataFrame (pd.DataFrame): stock_price with generated AdjustedClose
    """
    # transform Date column into datetime
    # (direct column assignment, so the column really gets a datetime dtype)
    price["Date"] = pd.to_datetime(price["Date"], format="%Y-%m-%d")

    def generate_adjusted_close(df):
        """
        Args:
            df (pd.DataFrame) : stock_price for a single SecuritiesCode
        Returns:
            df (pd.DataFrame): stock_price with AdjustedClose for a single SecuritiesCode
        """
        # sort data to generate CumulativeAdjustmentFactor
        df = df.sort_values("Date", ascending=False)  # descending: newest row first
        # generate CumulativeAdjustmentFactor (cumprod is the cumulative product)
        df.loc[:, "CumulativeAdjustmentFactor"] = df["AdjustmentFactor"].cumprod()
        # generate AdjustedClose, rounded half up to one decimal place
        df.loc[:, "AdjustedClose"] = (
            df["CumulativeAdjustmentFactor"] * df["Close"]
        ).map(lambda x: float(
            Decimal(str(x)).quantize(Decimal('0.1'), rounding=ROUND_HALF_UP)
        ))
        # reverse order
        df = df.sort_values("Date")  # back to ascending
        # to fill AdjustedClose, replace 0 into np.nan
        df.loc[df["AdjustedClose"] == 0, "AdjustedClose"] = np.nan
        # forward fill AdjustedClose (ffill copies the previous row's value)
        df.loc[:, "AdjustedClose"] = df.loc[:, "AdjustedClose"].ffill()
        return df

    # generate AdjustedClose
    price = price.sort_values(["SecuritiesCode", "Date"])
    price = price.groupby("SecuritiesCode").apply(generate_adjusted_close).reset_index(drop=True)
    price.set_index("Date", inplace=True)
    return price
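To make the adjustment logic concrete, here is a minimal sketch on made-up data for a single stock. The dates, prices, and the 2-for-1 split are hypothetical. The sketch assumes the factor 0.5 is recorded on the last row quoted at the pre-split price, which is what the inclusive cumulative product in the code above implies. The helper name `round_half_up` is introduced only for this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

import pandas as pd

# Hypothetical toy data: a 2-for-1 split, with the factor 0.5 on 2022-01-05
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-04", "2022-01-05", "2022-01-06", "2022-01-07"]),
    "Close": [1000.0, 1010.0, 505.0, 510.0],
    "AdjustmentFactor": [1.0, 0.5, 1.0, 1.0],
})

# Walk backwards in time so each row accumulates its own factor
# and every factor recorded on later dates
desc = toy.sort_values("Date", ascending=False)
desc["CumulativeAdjustmentFactor"] = desc["AdjustmentFactor"].cumprod()
desc["AdjustedClose"] = desc["CumulativeAdjustmentFactor"] * desc["Close"]
adjusted = desc.sort_values("Date")
print(adjusted[["Date", "Close", "AdjustedClose"]])


def round_half_up(x):
    """Round to one decimal place, with ties going away from zero."""
    return float(Decimal(str(x)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


# Built-in round() sends ties to the even digit; the Decimal version rounds them up
print(round(0.25, 1), round_half_up(0.25))
```

After the adjustment the pre-split closes are halved (1000 becomes 500, 1010 becomes 505), so the series no longer shows an artificial 50% drop on the split date. The second print shows why the code goes through `Decimal`: the built-in `round` would turn 0.25 into 0.2, while half-up rounding gives 0.3.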
02
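Features
One-day return (return_1day), computed from AdjustedClose
ExpectedDividend, binarized to 1 when a dividend is expected and 0 otherwise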
def get_features_for_predict(price, code):
    """
    Args:
        price (pd.DataFrame) : pd.DataFrame include stock_price
        code (int) : A local code for a listed company
    Returns:
        feature DataFrame (pd.DataFrame)
    """
    close_col = "AdjustedClose"
    feats = price.loc[price["SecuritiesCode"] == code, ["SecuritiesCode", close_col, "ExpectedDividend"]].copy()

    # calculate return using AdjustedClose
    feats["return_1day"] = feats[close_col].pct_change(1)

    # ExpectedDividend: any positive value becomes the flag 1
    feats["ExpectedDividend"] = feats["ExpectedDividend"].mask(feats["ExpectedDividend"] > 0, 1)

    # filling data for nan and inf
    feats = feats.fillna(0)
    feats = feats.replace([np.inf, -np.inf], 0)
    # drop AdjustedClose column
    feats = feats.drop([close_col], axis=1)

    return feats
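The two features can be checked by hand on a few rows. The following is a small sketch with hypothetical prices and dividend values for one stock. It repeats the same pandas calls as the function above rather than calling it, so it runs on its own.

```python
import numpy as np
import pandas as pd

# Hypothetical three-day history for a single stock
toy = pd.DataFrame(
    {
        "AdjustedClose": [100.0, 102.0, 99.96],
        "ExpectedDividend": [np.nan, 25.0, np.nan],
    },
    index=pd.to_datetime(["2022-01-04", "2022-01-05", "2022-01-06"]),
)

feats = toy.copy()
# Day-over-day relative change of the adjusted close
feats["return_1day"] = feats["AdjustedClose"].pct_change(1)
# Positive dividend amounts become 1; NaN fails the comparison and stays NaN for now
feats["ExpectedDividend"] = feats["ExpectedDividend"].mask(feats["ExpectedDividend"] > 0, 1)
# The first day's return and the missing dividends become 0
feats = feats.fillna(0).replace([np.inf, -np.inf], 0)
print(feats[["return_1day", "ExpectedDividend"]])
```

The first row has no previous day, so its return is filled with 0. The dividend amount (25.0 here) is discarded and only the presence of a dividend is kept.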
03
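Model prediction
The author did not build a complex model. The prediction is simply the sum of the two features, with the ExpectedDividend flag weighted by 100. This simplicity is well worth studying and thinking about.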
# NOTE: df_price_raw, prices, current_date and sample_prediction are not defined in this
# excerpt; presumably they are supplied by the inference loop of the original notebook.
df_price = adjust_price(df_price_raw)

# get target SecuritiesCodes
codes = sorted(prices["SecuritiesCode"].unique())

# generate feature
feature = pd.concat([get_features_for_predict(df_price, code) for code in codes])
# filter feature for this iteration (copy to avoid writing into a slice)
feature = feature.loc[feature.index == current_date].copy()

# prediction
feature.loc[:, "predict"] = feature["return_1day"] + feature["ExpectedDividend"]*100

# set rank by predict: the smallest score gets Rank 0
feature = feature.sort_values("predict", ascending=True).drop_duplicates(subset=['SecuritiesCode'])
feature.loc[:, "Rank"] = np.arange(len(feature))
feature_map = feature.set_index('SecuritiesCode')['Rank'].to_dict()
sample_prediction['Rank'] = sample_prediction['SecuritiesCode'].map(feature_map)
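The ranking step is easier to follow on a handful of rows. The sketch below uses hypothetical security codes and feature values, and applies the same scoring and sorting as the code above. The variable name `rank_map` is introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical features for four stocks on one date
feature = pd.DataFrame({
    "SecuritiesCode": [1301, 1332, 1333, 1375],
    "return_1day": [0.03, -0.02, 0.01, -0.05],
    "ExpectedDividend": [0, 0, 0, 1],
})

# Same score as above: yesterday's return plus 100 times the dividend flag
feature["predict"] = feature["return_1day"] + feature["ExpectedDividend"] * 100

# Ascending sort: the smallest score receives Rank 0
feature = feature.sort_values("predict", ascending=True)
feature["Rank"] = np.arange(len(feature))
rank_map = feature.set_index("SecuritiesCode")["Rank"].to_dict()
print(rank_map)
```

Two things stand out. Among stocks without an expected dividend, the one with the lowest one-day return gets Rank 0, so the ordering amounts to a bet on short-term reversal if Rank 0 denotes the highest expected return, as in the competition's submission format. A daily return is almost always far smaller than 100 in absolute value, so the weight of 100 places every stock with an expected dividend at the very end of the ranking regardless of its return; stock 1375 ends up last here despite having the lowest return.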
References
https://www.kaggle.com/code/flat831/4th-place-model/notebook?scriptVersionId=100052889
https://www.kaggle.com/competitions/jpx-tokyo-stock-exchange-prediction/discussion/359151