SHUISHAN DATASETS
Open-source dataset(s) from the Shuishan (Sequoia) eLearning platform
Introduction
This repository publishes datasets collected from the Shuishan (Sequoia) eLearning platform, which you can download and use for your own studies. All data is desensitized and easy to use, and it covers course resources, student behaviour, and other data.
Datasets
Currently available public datasets (only SHUISHAN-CLAD for now):
Download
We provide the following download options:
1. Download Links:
   - Aliyun OSS
   - Baidu Disk
     - Link: https://pan.baidu.com/s/1Pbx0u0x2UV0-a9pO8tHtlw
     - Password: l9yr
   - Google Drive
2. wget
Command:
wget https://shuishan-dataset.oss-accelerate.aliyuncs.com/SHUISHAN-CLAD.zip --no-check-certificate
unzip SHUISHAN-CLAD.zip
We recommend the wget command or the Aliyun OSS link for downloading, as these are typically the most up-to-date options. The cloud storage links may not be updated as frequently.
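Once the archive is unpacked, it can help to check what it contains before writing any analysis code. The sketch below is not part of the official tooling: it walks a directory, loads every `.json` file, and reports the top-level type and entry count of each. The function name `summarize_json_files` and the extraction directory `SHUISHAN-CLAD` are assumptions for illustration, and the code assumes each file holds a single JSON document.

```python
import json
from pathlib import Path


def summarize_json_files(root):
    """Map each .json file under root to (top-level type name, entry count or None)."""
    summary = {}
    for path in sorted(Path(root).rglob("*.json")):
        # Assumes one JSON document per file, encoded as UTF-8.
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        size = len(data) if isinstance(data, (list, dict)) else None
        summary[str(path.relative_to(root))] = (type(data).__name__, size)
    return summary


if __name__ == "__main__":
    # "SHUISHAN-CLAD" is an assumed directory name; point this at wherever unzip placed the files.
    for name, (kind, size) in summarize_json_files("SHUISHAN-CLAD").items():
        print(f"{name}: {kind}, {size} top-level entries")
```

If a file turns out to be very large or line-delimited, `json.load` may be slow or fail, in which case a streaming or line-by-line reader would be the better choice.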
Get in Touch
If you wish to use our dataset for research purposes, please contact us via email ( 51265903105@stu.ecnu.edu.cn ) to obtain the necessary authorization. We would also appreciate it if you could acknowledge our contribution in your paper's acknowledgments section.
If you have any questions or suggestions, feel free to reach out to us via email or create GitHub issues.
Data Descriptions
1. SHUISHAN-CLAD: Course Learning Action Dataset
This dataset contains 866,000+ records of course learning actions, covering student learning activities such as watching videos, reading articles, and completing exercises. The data was collected from 113 courses and over 18,200 students.
The json files are in the following format:
Detailed Statistics of the SHUISHAN-CLAD:
course_id | teachclass_count | video_records | exam_records | homework_records | student_records | attendance_records |
---|---|---|---|---|---|---|
数据挖掘 | 1 | 0 | 0 | 335 | 957 | 0 |
人工智能基础与科学探索实践-郑凯 | 2 | 0 | 0 | 0 | 57 | 0 |
大数据与人工智能 | 1 | 0 | 0 | 12 | 0 | 171 |
专业英语 | 1 | 0 | 0 | 244 | 558 | 0 |
Python编程基础 | 4 | 0 | 995 | 1916 | 6527 | 3205 |
专业英语-2025 | 1 | 0 | 0 | 0 | 284 | 0 |
人工智能与科学探索实践-陈优广 | 4 | 2345 | 2213 | 1028 | 8070 | 2411 |
AIGC在文化教学中的应用 | 3 | 86 | 0 | 4 | 195 | 0 |
数字媒体与AI创作实践 | 6 | 8878 | 0 | 19323 | 6485 | 4287 |
编程思维与实践(体育学院) | 3 | 0 | 289 | 1354 | 3873 | 303 |
汉语口语 | 1 | 0 | 0 | 0 | 0 | 0 |
HSK标准教程3 | 1 | 0 | 1 | 0 | 4 | 0 |
概率论与数理统计 | 3 | 207 | 0 | 2606 | 8662 | 0 |
计算机通识课 | 1 | 427 | 31 | 8 | 210 | 0 |
人工智能与智慧教育实践(体育学院) | 2 | 0 | 0 | 1780 | 3531 | 698 |
说汉语写汉字-tc | 1 | 0 | 0 | 0 | 0 | 0 |
学术英语读写课程 | 1 | 3 | 364 | 0 | 590 | 0 |
设计思维-拔尖班 | 1 | 0 | 0 | 0 | 0 | 51 |
数据与编程 | 1 | 31 | 10 | 0 | 85 | 0 |
数据科学与工程专题选讲 | 2 | 0 | 0 | 0 | 0 | 0 |
编程思维与实践(下)——探索数据的世界 | 1 | 0 | 0 | 0 | 42 | 461 |
开源软件通识基础 | 1 | 28 | 0 | 0 | 10 | 0 |
计算机与程序设计基础(D) | 7 | 0 | 0 | 0 | 0 | 118 |
neXt-lab的机器学习 | 1 | 0 | 0 | 0 | 74 | 17 |
人工智能基础与应用 | 32 | 15101 | 590 | 951 | 19192 | 3041 |
编程思维与实践(理科组)(陈优广) | 2 | 968 | 1110 | 1193 | 3997 | 2012 |
程序设计 | 9 | 6384 | 127 | 0 | 2856 | 0 |
计算机系统 | 4 | 0 | 0 | 217 | 791 | 0 |
数据挖掘-2025春 | 1 | 0 | 0 | 320 | 1682 | 0 |
编译原理 | 2 | 0 | 0 | 0 | 1258 | 0 |
2022年人工智能初探 | 2 | 630 | 61 | 87 | 556 | 536 |
说汉语3 | 1 | 0 | 0 | 0 | 0 | 0 |
事业启航-数据学院 | 2 | 42 | 0 | 0 | 614 | 0 |
AI赋能文化教学-汉语教师志愿者培训 | 1 | 1 | 0 | 0 | 23 | 0 |
当代数据管理系统 | 10 | 74935 | 3003 | 2531 | 66874 | 36 |
社会计算 | 3 | 0 | 50 | 55 | 383 | 0 |
人工智能创意编程之 python | 1 | 0 | 0 | 0 | 3 | 0 |
统计方法与机器学习 | 3 | 1946 | 0 | 1305 | 4230 | 0 |
云计算系统 | 1 | 0 | 0 | 0 | 0 | 0 |
XX | 1 | 0 | 0 | 0 | 0 | 0 |
计算机网络(拔尖基地) | 1 | 0 | 0 | 151 | 535 | 0 |
计算机文化与数字胜任力 | 4 | 0 | 50 | 0 | 54489 | 0 |
编程思维与实践(理科组) | 56 | 36946 | 4284 | 2213 | 41684 | 13849 |
编程思维与实践(2021) | 4 | 1322 | 9 | 0 | 453 | 0 |
开源软件设计与开发(本科生) | 1 | 4 | 6 | 0 | 4 | 20 |
数据科学与数据智能实践 | 37 | 9929 | 9360 | 8945 | 34612 | 13835 |
编程思维与实践 | 31 | 31846 | 3894 | 11463 | 48271 | 12970 |
应用编程实践 | 2 | 0 | 30 | 43 | 438 | 620 |
数据系统前沿 | 1 | 0 | 0 | 0 | 4 | 0 |
2G-语言课 | 1 | 0 | 0 | 0 | 0 | 0 |
数据伦理 | 8 | 0 | 0 | 2466 | 1625 | 213 |
多媒体技术与应用(HA&M) | 7 | 0 | 281 | 4096 | 7544 | 1674 |
机器学习 | 2 | 0 | 0 | 0 | 0 | 0 |
高性能计算与并行计算 | 1 | 0 | 0 | 45 | 1 | 42 |
信息系统与数字社会 | 1 | 11 | 15 | 0 | 47 | 0 |
编程思维与实践(数字媒体) | 6 | 0 | 0 | 6609 | 3537 | 2885 |
Parliamo Cinese我们说汉语(1)(天池) | 1 | 0 | 0 | 0 | 5 | 0 |
机器学习(2024) | 1 | 0 | 0 | 0 | 0 | 684 |
Python语言程序设计 | 1 | 0 | 1 | 0 | 1 | 0 |
数字媒体与交互设计 | 8 | 0 | 0 | 0 | 3 | 771 |
算法与人工智能 | 1 | 0 | 12 | 0 | 52 | 0 |
分布式计算系统 | 3 | 28895 | 0 | 0 | 22108 | 0 |
2022高校学生人工智能训练营(英特尔-华师大) | 1 | 0 | 0 | 119 | 2 | 0 |
计算机视觉 | 1 | 0 | 0 | 0 | 0 | 0 |
程序设计(计算机拔尖基地) | 5 | 425 | 536 | 1780 | 4738 | 1924 |
程序优化系统设计(上) | 1 | 4 | 0 | 0 | 63 | 0 |
2022“人工智能”教学研讨班 | 1 | 322 | 0 | 0 | 459 | 0 |
人工智能与科学探索(蒲鹏) | 3 | 712 | 557 | 939 | 6818 | 3048 |
数据科学与工程算法基础 | 1 | 100 | 0 | 0 | 514 | 0 |
人工智能与科学探索实践-朱晴婷 | 7 | 1070 | 389 | 523 | 2915 | 5453 |
Parliamo Cinese我们说汉语(1) | 1 | 0 | 1 | 3 | 17 | 2 |
人工智能与科学探索实践 | 28 | 7710 | 984 | 0 | 16942 | 7536 |
计算机科学中的伟大思想 | 2 | 0 | 41 | 2 | 167 | 283 |
编程思维与实践(微专业) | 1 | 51 | 0 | 0 | 48 | 0 |
数据结构(拔尖基地) | 2 | 0 | 0 | 44 | 1925 | 0 |
软件系统优化 | 5 | 176 | 0 | 1020 | 2515 | 0 |
数据思维与实践 | 1 | 962 | 0 | 0 | 364 | 0 |
数字素养 | 1 | 3 | 0 | 0 | 1 | 83 |
我们说汉语3 | 2 | 0 | 0 | 0 | 0 | 0 |
计算机视觉(2024) | 1 | 0 | 0 | 0 | 0 | 0 |
统计与机器学习(非全) | 3 | 112 | 0 | 388 | 1214 | 0 |
事业启航 | 2 | 174 | 0 | 23 | 51 | 0 |
数据科学与工程数学基础 | 1 | 123 | 0 | 0 | 35 | 0 |
数据学院2022年双创展示 | 1 | 0 | 0 | 0 | 2 | 0 |
数据分析与大数据 | 18 | 2495 | 250 | 0 | 2809 | 1704 |
人类思维与学科史论-计算机 | 1 | 0 | 0 | 0 | 0 | 0 |
网络与数字安全 | 1 | 0 | 10 | 0 | 29 | 0 |
区块链系统 | 1 | 0 | 0 | 0 | 5 | 0 |
计算机系统(拔尖基地) | 3 | 377 | 0 | 0 | 2627 | 0 |
编程思维与实践(理科组)(刘小平) | 3 | 245 | 0 | 0 | 826 | 1729 |
web编程 | 3 | 0 | 0 | 833 | 5328 | 612 |
Metasequoia Cup Coding Competition | 4 | 0 | 0 | 0 | 0 | 0 |
开源软件开发与社区治理(研究生) | 1 | 21 | 55 | 0 | 20 | 924 |
当代人工智能 | 1 | 0 | 154 | 0 | 1 | 0 |
说汉语写汉字 — 第13课 现在几点 | 1 | 0 | 0 | 0 | 0 | 0 |
算法基础 | 1 | 0 | 0 | 0 | 0 | 0 |
计算教育学2022 | 1 | 0 | 14 | 27 | 1 | 140 |
程序优化系统设计(下) | 1 | 0 | 0 | 0 | 10 | 0 |
云计算应用与开发 | 2 | 0 | 0 | 0 | 14 | 0 |
人工智能基础(上海市重点课程) | 3 | 358 | 185 | 212 | 1643 | 524 |
数据思维与实践(2024) | 3 | 472 | 52 | 144 | 1109 | 498 |
软件开发管理与实践 | 1 | 0 | 0 | 5 | 0 | 0 |
学中文 | 1 | 0 | 0 | 0 | 0 | 0 |
人工智能与数学 | 1 | 4 | 0 | 0 | 5 | 0 |
设计思维 | 1 | 0 | 0 | 0 | 6 | 0 |
水杉公益 | 2 | 0 | 1 | 0 | 0 | 0 |
计算机编程语言 | 1 | 0 | 45 | 0 | 0 | 115 |
计算机视觉与多媒体信息处理 | 5 | 6564 | 0 | 0 | 4976 | 0 |
青少年编程教育训练营 | 6 | 0 | 0 | 70 | 292 | 329 |
数据科学与工程导论 | 6 | 524 | 39 | 865 | 4023 | 129 |
parla e scrivi | 1 | 0 | 0 | 0 | 0 | 0 |
人工智能初探 | 2 | 395 | 141 | 1 | 340 | 0 |
(B2)编程思维与实践 | 22 | 0 | 115 | 0 | 170 | 2609 |
Total Count: 113 | 471 | 244364 | 30355 | 78298 | 421110 | 92552 |
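As a quick consistency check, the "866,000+ records" figure quoted in the dataset description matches the sum of the five record columns in the Total Count row. The sketch below reproduces that arithmetic; the numbers are copied from the table, while the variable names are only illustrative.

```python
# Column totals copied from the "Total Count" row of the statistics table.
column_totals = {
    "video_records": 244364,
    "exam_records": 30355,
    "homework_records": 78298,
    "student_records": 421110,
    "attendance_records": 92552,
}

total_records = sum(column_totals.values())
print(f"{total_records:,} records across all five record types")  # 866,679

# Share of each record type, useful for seeing which activity dominates the dataset.
for name, count in column_totals.items():
    print(f"{name}: {count / total_records:.1%}")
```

Whether the headline figure was actually derived this way is an assumption; the sum simply agrees with it.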
Acknowledgements
Special thanks to East China Normal University, College of Data Science and Engineering of East China Normal University, and Metasequoia Online for their support and help in this project.