跳转至

加餐04 SQL实战:证券场景的数据查询与优化

你好,我是陈博士。今天我们来探讨一下客户持仓配置建议业务的SQL分析。

合理的资产配置,是确保客户投资组合既能满足其财务目标,又能有效管理风险的关键。通过深入的数据查询和分析,我们可以评估客户的当前资产状况、风险承受能力以及投资偏好,从而提供个性化的资产配置建议。

针对该场景,我整理了2张数据表以及对应的查询问题。针对这些查询问题,你可以了解到这些SQL该如何撰写。

数据表

  • 客户持仓表 customer_positions

  • 产品配置表 product_configuration

问题设定

我从客户资产分析、资产配置分析、风险等级分析、产品持仓排名等维度设置了一些常见的查询问题,一起来看一下。

1. 客户资产分析

查询指定客户(customer_id = ‘C00001’)在特定日期(2024-12-21)的各类资产(股票、基金、固收、结构性产品、现金)具体金额是多少?总资产是多少?

SELECT
    customer_id,
    customer_name,
    SUM(stock_amount) as total_stock,
    SUM(fund_amount) as total_fund,
    SUM(fixed_income_amount) as total_fixed_income,
    SUM(structured_amount) as total_structured,
    SUM(cash_amount) as total_cash,
    SUM(stock_amount + fund_amount + fixed_income_amount + structured_amount + cash_amount) as total_assets
FROM customer_positions FORCE INDEX (idx_customer_id)
WHERE customer_id = 'C00001'
AND position_date = '2024-12-21'
GROUP BY customer_id, customer_name;
 
// 说明: 使用 SUM() 聚合函数计算各类资产总金额,并通过 GROUP BY 按客户分组。FORCE INDEX 提示优化查询性能,确保快速定位指定客户和日期的数据。

2. 资产配置分析

在特定日期(2024-12-21),每个客户的股票资产占比、基金资产占、固定收益类占比、结构性产品占比、现金占比是多少?

WITH customer_total AS (
    SELECT
        customer_id,
        SUM(stock_amount + fund_amount + fixed_income_amount + structured_amount + cash_amount) as total_assets
    FROM customer_positions
    WHERE position_date = '2024-12-21'
    GROUP BY customer_id
)
SELECT
    cp.customer_id,
    cp.customer_name,
    ROUND(SUM(stock_amount) / ct.total_assets * 100, 2) as stock_ratio,
    ROUND(SUM(fund_amount) / ct.total_assets * 100, 2) as fund_ratio,
    ROUND(SUM(fixed_income_amount) / ct.total_assets * 100, 2) as fixed_income_ratio,
    ROUND(SUM(structured_amount) / ct.total_assets * 100, 2) as structured_ratio,
    ROUND(SUM(cash_amount) / ct.total_assets * 100, 2) as cash_ratio
FROM customer_positions cp
JOIN customer_total ct ON cp.customer_id = ct.customer_id
WHERE cp.position_date = '2024-12-21'
GROUP BY cp.customer_id, cp.customer_name, ct.total_assets;
 
// 说明:使用CTE预计算每位客户总资产,再通过JOIN结合明细数据计算各类资产占比。ROUND() 函数确保百分比结果的可读性。

3. 风险等级分析

在特定日期(2024-12-21),按风险等级统计,每个风险等级有多少个客户?每个风险等级的客户平均持有多少股票、基金、固收、结构性产品和现金?

SELECT STRAIGHT_JOIN
    risk_level,
    COUNT(DISTINCT customer_id) as customer_count,
    ROUND(AVG(stock_amount), 2) as avg_stock,
    ROUND(AVG(fund_amount), 2) as avg_fund,
    ROUND(AVG(fixed_income_amount), 2) as avg_fixed_income,
    ROUND(AVG(structured_amount), 2) as avg_structured,
    ROUND(AVG(cash_amount), 2) as avg_cash
FROM customer_positions FORCE INDEX (idx_position_date)
WHERE position_date = '2024-12-21'
GROUP BY risk_level
ORDER BY risk_level;
 
// 说明:使用 COUNT(DISTINCT) 统计各风险等级客户数量,AVG() 计算各类资产的平均持有量。STRAIGHT_JOIN 提示优化查询路径,确保高效执行。

4. 产品持仓排名

在特定日期(2024-12-21),哪10个产品持有人数最多?这些产品的总持仓金额是多少?

SELECT
    product_name,
    COUNT(DISTINCT customer_id) as holder_count,
    SUM(stock_amount + fund_amount + fixed_income_amount + structured_amount + cash_amount) as total_amount
FROM customer_positions
WHERE position_date = '2024-12-21'
AND product_name != '现金'
GROUP BY product_name
ORDER BY holder_count DESC, total_amount DESC
LIMIT 10;
 
// 说明:通过 COUNT(DISTINCT) 统计每个产品的持有人数,使用 SUM() 计算总持仓金额。ORDER BY 和 LIMIT 用于筛选出前10个最受欢迎的产品。

5. 客户资产变动分析

每个客户的每日总资产是多少?相比前一日的资产变动额是多少?相比前一日的资产变动百分比是多少?

WITH daily_assets AS (
    SELECT
        customer_id,
        position_date,
        SUM(stock_amount + fund_amount + fixed_income_amount + structured_amount + cash_amount) as total_assets
    FROM customer_positions
    GROUP BY customer_id, position_date
)
SELECT
    a.customer_id,
    a.position_date,
    a.total_assets,
    a.total_assets - LAG(a.total_assets) OVER (PARTITION BY a.customer_id ORDER BY a.position_date) as daily_change,
    ROUND((a.total_assets - LAG(a.total_assets) OVER (PARTITION BY a.customer_id ORDER BY a.position_date)) /
          LAG(a.total_assets) OVER (PARTITION BY a.customer_id ORDER BY a.position_date) * 100, 2) as change_percentage
FROM daily_assets a
ORDER BY a.customer_id, a.position_date;
 
// 说明:使用CTE计算每日总资产,LAG() 函数获取前一日资产值,进而计算日变动额和变动百分比。PARTITION BY 和 ORDER BY 确保按客户和日期正确排序。
 
OVER (PARTITION BY ...) 是 SQL 中窗口函数的一部分,用于定义窗口或分区,即数据集的一个子集。它允许我们在不改变行数的情况下对数据进行分组计算,而窗口函数则在这个定义的窗口上执行累积、移动平均等操作。

6. 产品类型分布

在特定日期(2024-12-21),查询不同类型产品的总持仓金额和客户数。

SELECT
    IFNULL(pc.product_type, 'Unknown') as product_type,
    COUNT(DISTINCT cp.customer_id) as customer_count,
    SUM(CASE
        WHEN pc.product_type = 'STOCK' THEN cp.stock_amount
        WHEN pc.product_type = 'EQUITY' THEN cp.fund_amount
        WHEN pc.product_type = 'FIXED_INCOME' THEN cp.fixed_income_amount
        WHEN pc.product_type = 'STRUCTURED' THEN cp.structured_amount
        ELSE cp.cash_amount
    END) as total_amount
FROM customer_positions cp
LEFT JOIN product_configuration pc ON cp.product_name = pc.product_name
WHERE cp.position_date = '2024-12-21'
GROUP BY pc.product_type WITH ROLLUP;  -- 添加汇总行
 
// 说明:使用 LEFT JOIN 关联产品配置表,通过 CASE 语句按产品类型汇总总持仓金额,COUNT(DISTINCT) 统计客户数,GROUP BY ... WITH ROLLUP 提供分类汇总和总计行。
 
GROUP BY ... WITH ROLLUP 是 SQL 中用于生成分组汇总报表的扩展语法。它在标准 GROUP BY 分组的基础上,为每个分组字段添加了额外的汇总行(即小计和总计),从而提供了更丰富的分析视角。

如果你是证券行业的从业人员,这节课将对你非常有帮助。期待你的转发,我们下节课再见!