5/17/2024

Preparing and Presenting Projects and Theses (tips for the final project and your thesis)

  • (at the bottom) Common problems to avoid, final written report (for your ppt content)
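  • (Undergraduates) Important tasks
    • Life's dilemmas:
      • Tal Ben-Shahar, Happier: Learn the Secrets to Daily Joy and Lasting Fulfillment, McGraw Hill, 2007. (Chinese edition: 更快樂:哈佛最受歡迎的一堂課, translated by 譚家瑜, 天下雜誌, 2012)
    • A good project or thesis develops the ability to "learn how to learn" so that you can solve unfamiliar problems.
  • Mindset and choosing teammates
  • Choose a problem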
    • Survey articles: ACM Computing Surveys, EJOR
    • Possible approaches: 
      • Find a dataset you are interested in, apply different machine learning algorithms to it, and compare their performance metrics.
      • Reproduce the results in a paper
      • Read and critique a paper
      • Use the method in a paper to solve a problem you propose 
    • Google scholar (search keywords, authors): 
      • Classic papers with high citation counts.
      • Use the papers that cite an article to follow the progress made since it was published.
    • arXiv (for newer research)
    • Google search "top machine learning conferences": ICLR, ICML, NeurIPS, ...
      • Best paper award
      • Trace, reproduce, and modify their source code
  • Preparation
    • Domain knowledge in operations research and machine learning: courses, newspapers and magazines, journal articles, observations from daily life, and personal interests
      • 數位時代、商業周刊、天下、哈佛商業評論
      • Economist, Harvard Business Review, New York Times, Wall Street Journal 
      • Useful newsletters (by subscription): (Daily) Inside AI at ai@inside.com, (Daily) Inside Tech at tech@inside.com, (Weekly) thebatch@deeplearning.ai
  • Presentation

    • Code is seldom presented in a lecture or seminar, so my presentations should not be taken as a model; the code in them is there to help students. Please refer to the following examples. You might include more detailed results or code as backup slides in case the audience asks related questions.
  • Writing and report
    • William Strunk Jr. and E.B. White, The Elements of Style, Pearson, 1999. (The best.)
    • Final written report: (Thanks to Prof. Ping-Shun Chen (陳平舜) for the reference files.)
      • Please use the INFORMS citation style.
        • If you use material from the Internet without citing it, it is treated as plagiarism.
      • Problem statement, background knowledge, the math models to solve the problems, and the (potential) real-world applications or academic contributions
      • If you read a paper that comes with Python code, learn from it in three steps:
        • Reproduce the results in the paper, e.g., a table or figure
        • Tune the hyperparameters or parameters, e.g., the mutation rate of a genetic algorithm or the learning rate of a machine learning algorithm
          • Comparison: computational time and solution quality (performance), measured against a baseline or an oracle
        • Rewrite some modules or functions, e.g., replace the crossover module with another approach
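The three steps above can be sketched on a toy problem. The following is only an illustration, not code from any particular paper: the knapsack instance (ITEMS, CAPACITY), the function names, and all parameter values are hypothetical. It runs a small genetic algorithm with several mutation rates, compares solution quality and running time against an exhaustive-search oracle, and accepts an alternative crossover function as a drop-in replacement.

```python
import itertools
import random
import time

# Hypothetical toy instance: (value, weight) pairs and a knapsack capacity.
ITEMS = [(10, 5), (40, 4), (30, 6), (50, 3), (35, 7), (25, 2),
         (15, 4), (45, 8), (20, 3), (55, 9), (12, 1), (28, 5)]
CAPACITY = 20


def fitness(bits):
    """Total value of the selected items, or 0 if the capacity is exceeded."""
    value = sum(v for b, (v, w) in zip(bits, ITEMS) if b)
    weight = sum(w for b, (v, w) in zip(bits, ITEMS) if b)
    return value if weight <= CAPACITY else 0


def oracle():
    """Exhaustive search over all 2^n selections; only feasible for tiny n."""
    return max(fitness(bits)
               for bits in itertools.product((0, 1), repeat=len(ITEMS)))


def one_point_crossover(a, b, rng):
    """Take the head of parent a and the tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def uniform_crossover(a, b, rng):
    """Alternative module: pick each gene from either parent with equal odds."""
    return tuple(x if rng.random() < 0.5 else y for x, y in zip(a, b))


def genetic_algorithm(mutation_rate, crossover=one_point_crossover,
                      pop_size=30, generations=60, seed=0):
    """Return the best fitness found; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    n = len(ITEMS)
    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # Tournament selection of size 2 for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            child = crossover(p1, p2, rng)
            # Bit-flip mutation: flip each gene with probability mutation_rate.
            child = tuple(1 - g if rng.random() < mutation_rate else g
                          for g in child)
            children.append(child)
        pop = children
        best = max(pop + [best], key=fitness)
    return fitness(best)


if __name__ == "__main__":
    start = time.perf_counter()
    optimum = oracle()
    print(f"oracle: value={optimum}, time={time.perf_counter() - start:.3f}s")
    for rate in (0.0, 0.01, 0.05, 0.2):
        start = time.perf_counter()
        value = genetic_algorithm(rate)
        elapsed = time.perf_counter() - start
        print(f"mutation_rate={rate}: value={value}, "
              f"ratio={value / optimum:.2f}, time={elapsed:.3f}s")
```

Passing crossover=uniform_crossover to genetic_algorithm corresponds to the third step. In a real report, average the results over several seeds rather than relying on a single run.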
      • If your topic is operations research
        • What problem is to be solved?
        • Mathematical model: the decision variables, the objective function and the interpretation of each of its terms, and the constraints with their types and meanings
        • Software solvers and algorithm development: Python code or an Excel solution. What is the optimal (or suboptimal) policy? Compare it with intuitive or greedy policies, and report the duality gap, the sensitivity analysis, and their interpretation
        • Conclusion, future work and research
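As an illustration of the model and sensitivity-analysis items above, here is a minimal sketch on a hypothetical two-variable product-mix model (all coefficients are made up for the example). It solves the linear program by enumerating the vertices of the feasible region, which only works for tiny, bounded problems; a real project would use a solver. The shadow prices are estimated by raising each right-hand side by one unit, which matches the dual values only while the change stays within the allowable range.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical product-mix model (all numbers are illustrative):
#   maximize   3*x + 5*y          (profit per unit of products x and y)
#   subject to   x        <= 4    (capacity of plant 1)
#                    2*y  <= 12   (capacity of plant 2)
#              3*x + 2*y  <= 18   (capacity of plant 3)
#                x, y >= 0
OBJECTIVE = (3, 5)
COEFFICIENTS = [(1, 0), (0, 2), (3, 2)]
RHS = [4, 12, 18]


def solve(rhs_values):
    """Return (objective value, x, y) at the best vertex of the feasible region."""
    # Write every constraint, including x >= 0 and y >= 0, as a*x + b*y <= c.
    rows = [(a, b, c) for (a, b), c in zip(COEFFICIENTS, rhs_values)]
    rows += [(-1, 0, 0), (0, -1, 0)]
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(rows, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never intersect
        # Cramer's rule gives the intersection of the two boundary lines.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in rows):
            value = OBJECTIVE[0] * x + OBJECTIVE[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


def shadow_prices(rhs_values):
    """Change in the optimal value when each right-hand side grows by one unit."""
    base_value = solve(rhs_values)[0]
    prices = []
    for i in range(len(rhs_values)):
        bumped = list(rhs_values)
        bumped[i] += 1
        prices.append(solve(bumped)[0] - base_value)
    return prices


if __name__ == "__main__":
    value, x, y = solve(RHS)
    print(f"optimal plan: x={x}, y={y}, profit={value}")
    for i, price in enumerate(shadow_prices(RHS), start=1):
        print(f"one more unit of capacity at plant {i} adds {price} to the profit")
```

For this instance the optimum is x = 2, y = 6 with profit 36, and the estimated shadow prices are 0, 3/2, and 1. The interpretation is the part a report should spell out: plant 1 has slack, so extra capacity there is worth nothing, while plant 2 is the most valuable place to expand.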
      • If machine learning
        • Data exploration: mention anything special in the preprocessing, the important features (identified by correlation coefficients, Lasso, or, even better, best subset selection or a holistic approach, or by feature_importances_ from random forests), and your insights
        • Model structure (feedforward or another architecture and, for a neural network, the number of nodes or the size of the convolution filter), feature engineering, hyperparameters of the training algorithm (those other than the default values), and the validation process (or cross-validation)
          • Please use at least ensemble trees and neural networks, which are among the most frequently winning methods in Kaggle competitions.
          • We teach only the most basic approach, grid search. Please search "hyperparameter optimization algorithms" on Google Scholar for more advanced research results.
          • There is always room for improvement. The amazing paper Stable Regression demonstrated that performance can be improved even by using optimization-based splitting instead of random splitting.
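To make the grid search and cross-validation items concrete, here is a minimal pure-Python sketch. It is an assumption-laden toy: the synthetic data, the k-nearest-neighbors classifier, and the function names (make_data, knn_predict, cv_accuracy, grid_search) are all hypothetical, and in practice you would normally use a library implementation. The point it shows is the structure: hold out a test set first, score every hyperparameter combination by k-fold cross-validation on the training data only, and touch the test set once at the end.

```python
import itertools
import random
from collections import Counter


def make_data(n=120, seed=0):
    """Hypothetical synthetic data: two noisy 2-D clusters labeled 0 and 1."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k, p):
    """Majority vote among the k nearest training points (Minkowski order p)."""
    def dist(a, b):
        # The p-th power of the distance; it preserves the neighbor ordering.
        return sum(abs(u - v) ** p for u, v in zip(a, b))
    neighbors = sorted(train, key=lambda row: dist(row[0], point))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


def cv_accuracy(data, k, p, folds=5):
    """Mean validation accuracy over `folds` disjoint folds."""
    scores = []
    for f in range(folds):
        valid = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        hits = sum(knn_predict(train, x, k, p) == y for x, y in valid)
        scores.append(hits / len(valid))
    return sum(scores) / folds


def grid_search(data, grid):
    """Score every combination in the grid; return (best_params, best_score)."""
    names = list(grid)
    results = {}
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        results[combo] = cv_accuracy(data, **params)
    best = max(results, key=results.get)
    return dict(zip(names, best)), results[best]


if __name__ == "__main__":
    data = make_data()
    train, test = data[:90], data[90:]  # hold out a test set before tuning
    best_params, cv_score = grid_search(
        train, {"k": [1, 3, 5, 9, 15], "p": [1, 2]})
    test_hits = sum(knn_predict(train, x, **best_params) == y for x, y in test)
    print("best hyperparameters:", best_params)
    print("cross-validation accuracy:", round(cv_score, 3))
    print("test accuracy:", round(test_hits / len(test), 3))
```

The cost of grid search grows with the product of the grid sizes, which is one reason the more advanced hyperparameter optimization methods are worth looking up.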
        • Performance comparison: different algorithms (each with its tuned hyperparameters), training and testing performance metrics (ROC or R^2), insights, and your observations
        • Conclusion, future work and research
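For the performance comparison, it helps to know exactly what the two metrics compute. The sketch below uses the standard definitions: the area under the ROC curve as the fraction of (positive, negative) pairs that the model ranks correctly, and R^2 as one minus the ratio of the residual to the total sum of squares. The function names and the labels and scores in the example are hypothetical, for illustration only.

```python
def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical labels and model scores, for illustration only.
    splits = {
        "train": ([0, 0, 1, 1, 1, 0], [0.1, 0.3, 0.9, 0.8, 0.7, 0.2]),
        "test": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    }
    for name, (labels, scores) in splits.items():
        print(f"{name:5s} AUC = {roc_auc(labels, scores):.2f}")
```

Reporting the training and testing values side by side, as in the loop above, makes overfitting visible: a large gap between the two is itself an observation worth discussing in the report.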
      • Work assignment for each team member
        • Summary of what your team learned (team reflections)
        • Please state how the work on the final report was divided among the team members. Students who have not contributed anything will receive zero points for the final project (both the written report and the presentation).
      • If requested, the written report, including the cover, should not exceed 15 pages. Use a single-column layout, 12-point body text set in 標楷體 and Times New Roman, and 1.5 line spacing; the title size is not restricted.
      • Please upload your PDF file to cycu ilearn one day before the last class meeting.
