【数据竞赛】日本炒股大赛最稳定的金牌方案（含Code）

作者：flaty日本炒股大赛最稳定的金牌方案简介第四名的选手是从第一轮到最后一轮都一直稳居在前排的选手，所以其方案在日本炒股大赛中是非常稳定的，其背后究竟有何秘密，我们一起来学习一下它的开源代码。方案01数据预处理对收盘价进行价格的调整defadjust_price(price):"""Args:price(pd.DataFrame):pd.Dat...

风度78

555人浏览 · 2022-11-26 12:00:57

风度78 · 2022-11-26 12:00:57 发布

作者：flaty

日本炒股大赛最稳定的金牌方案

简介

第四名的选手是从第一轮到最后一轮都一直稳居在前排的选手，所以其方案在日本炒股大赛中是非常稳定的，其背后究竟有何秘密，我们一起来学习一下它的开源代码。

方案

数据预处理

对收盘价进行价格的调整

def adjust_price(price):
    """
    Args:
        price (pd.DataFrame)  : pd.DataFrame include stock_price
    Returns:
        price DataFrame (pd.DataFrame): stock_price with generated AdjustedClose
    """
    # transform Date column into datetime
    price.loc[: ,"Date"] = pd.to_datetime(price.loc[: ,"Date"], format="%Y-%m-%d")

    def generate_adjusted_close(df):
        """
        Args:
            df (pd.DataFrame)  : stock_price for a single SecuritiesCode
        Returns:
            df (pd.DataFrame): stock_price with AdjustedClose for a single SecuritiesCode
        """
        # sort data to generate CumulativeAdjustmentFactor
        df = df.sort_values("Date", ascending=False)#降順(最新のものが先頭)
        # generate CumulativeAdjustmentFactor
        df.loc[:, "CumulativeAdjustmentFactor"] = df["AdjustmentFactor"].cumprod()#cumprodは累積積を求める関数
        # generate AdjustedClose
        df.loc[:, "AdjustedClose"] = (
            df["CumulativeAdjustmentFactor"] * df["Close"]
        ).map(lambda x: float(
            Decimal(str(x)).quantize(Decimal('0.1'), rounding=ROUND_HALF_UP)#四捨五入
        ))
        # reverse order
        df = df.sort_values("Date")#昇順に戻す
        # to fill AdjustedClose, replace 0 into np.nan
        df.loc[df["AdjustedClose"] == 0, "AdjustedClose"] = np.nan
        # forward fill AdjustedClose
        df.loc[:, "AdjustedClose"] = df.loc[:, "AdjustedClose"].ffill()#ffill:前(上)の値に置換
        return df

    # generate AdjustedClose
    price = price.sort_values(["SecuritiesCode", "Date"])
    price = price.groupby("SecuritiesCode").apply(generate_adjusted_close).reset_index(drop=True)

    price.set_index("Date", inplace=True)
    return price

特征

一天的收益回报率
ExpectedDividend

def get_features_for_predict(price, code):
    """
    Args:
        price (pd.DataFrame)  : pd.DataFrame include stock_price
        code (int)  : A local code for a listed company
    Returns:
        feature DataFrame (pd.DataFrame)
    """
    close_col = "AdjustedClose"
    feats = price.loc[price["SecuritiesCode"] == code, ["SecuritiesCode", close_col, "ExpectedDividend"]].copy()

    # calculate return using AdjustedClose
    feats["return_1day"] = feats[close_col].pct_change(1)
    
    # ExpectedDividend
    feats["ExpectedDividend"] = feats["ExpectedDividend"].mask(feats["ExpectedDividend"] > 0, 1)

    # filling data for nan and inf
    feats = feats.fillna(0)
    feats = feats.replace([np.inf, -np.inf], 0)
    # drop AdjustedClose column
    feats = feats.drop([close_col], axis=1)

    return feats

模型预测

作者没有构建复杂的模型，只是两个特征进行简单的想加，这个是非常值得学习和思考的。

df_price = adjust_price(df_price_raw) 
    # get target SecuritiesCodes
    codes = sorted(prices["SecuritiesCode"].unique())

    # generate feature
    feature = pd.concat([get_features_for_predict(df_price, code) for code in codes])
    # filter feature for this iteration
    feature = feature.loc[feature.index == current_date]

    # prediction
    feature.loc[:, "predict"] = feature["return_1day"] + feature["ExpectedDividend"]*100

    # set rank by predict
    feature = feature.sort_values("predict", ascending=True).drop_duplicates(subset=['SecuritiesCode'])
    feature.loc[:, "Rank"] = np.arange(len(feature))
    feature_map = feature.set_index('SecuritiesCode')['Rank'].to_dict()
    sample_prediction['Rank'] = sample_prediction['SecuritiesCode'].map(feature_map)

参考文献

https://www.kaggle.com/code/flat831/4th-place-model/notebook?scriptVersionId=100052889
https://www.kaggle.com/competitions/jpx-tokyo-stock-exchange-prediction/discussion/359151

往期精彩回顾




适合初学者入门人工智能的路线及资料下载(图文+视频)机器学习入门系列下载机器学习及深度学习笔记等资料打印《统计学习方法》的代码复现专辑机器学习交流qq群955171419，加入微信群请扫码

AtomGit 开源协作平台测评赛

瓜分20万奖金获得内推名额丰厚实物奖励易参与易上手

更多推荐

C#联合Halcon深度学习源代码分享1 预处理图像2图像识别测试3误差分析（含导入步骤文档，含中文注释）（附源码链接）

开放原子开发者工作坊

Git基础命令学习

git基础命令学习笔记git init 命令目录变成 Git 可以管理的仓库git add 把文件添加到仓库(可多次add不同的文件)git commit 把文件提交到仓库git satus 命令查看状态，可以让我们时刻掌握仓库当前的状态git diff可以看到指定文件的修改内容

开放原子开发者工作坊

SpringBoot01:Hello,World

开放原子开发者工作坊

所有评论(0)

查看更多评论

风度78

@fengdu78

已为社区贡献31条内容