9/25/2020

Statistical Modeling: The Two Cultures

Leo Breiman, Statistical Modeling: The Two Cultures, Statistical Science, 2001, Vol. 16, No. 3, 199–231.

There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.
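The contrast between the two cultures can be made concrete with a small sketch. This example is not from Breiman's paper: the sine-shaped target, the even/odd train/test split, and the choice of k = 2 are all illustrative assumptions. It fits a straight line (a data model that assumes y = a + b·x plus noise) and a k-nearest-neighbour predictor (an algorithmic model that assumes no functional form), then compares them on held-out prediction error.

```python
import math

def fit_line(xs, ys):
    """Data-model culture: assume y = a + b*x + noise, estimate a and b by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_knn(xs, ys, k=2):
    """Algorithmic culture: no assumed form, predict the mean of the k nearest training targets."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

def mse(predict, xs, ys):
    """Mean squared prediction error on the given points."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def truth(x):
    # Hypothetical "unknown" mechanism, chosen only for illustration.
    return math.sin(2 * x)

grid = [i / 10 for i in range(60)]
train_x, test_x = grid[0::2], grid[1::2]   # even indices train, odd indices test
train_y = [truth(x) for x in train_x]
test_y = [truth(x) for x in test_x]

linear_mse = mse(fit_line(train_x, train_y), test_x, test_y)
knn_mse = mse(fit_knn(train_x, train_y), test_x, test_y)
print(f"held-out MSE  linear: {linear_mse:.4f}  k-NN: {knn_mse:.4f}")
```

On this deliberately nonlinear toy problem the nearest-neighbour predictor has much lower held-out error than the line. The result says nothing about real data sets; it only illustrates judging a model by predictive accuracy rather than by fit to an assumed form.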

9/18/2020

Introducing data science

Davy Cielen, Arno D. B. Meysman, and Mohamed Ali, Introducing data science: Big data, machine learning, and more, using Python tools, Manning, May 2016.

Introducing Data Science explains vital data science concepts and teaches you how to accomplish the fundamental tasks that occupy data scientists. You'll explore data visualization, graph databases, the use of NoSQL, and the data science process. You'll use the Python language and common Python libraries as you experience firsthand the challenges of dealing with data at scale. Discover how Python allows you to gain insights from data sets so big that they need to be stored on multiple machines, or from data moving so quickly that no single machine can handle it. This book gives you hands-on experience with the most popular Python data science libraries, Scikit-learn and StatsModels. After reading this book, you'll have the solid foundation you need to start a career in data science.

9/17/2020

Data Science in Production: Building Scalable Model Pipelines with Python

Ben Weber, Data Science in Production: Building Scalable Model Pipelines with Python, Independently published, 2020.

Putting predictive models into production is one of the most direct ways that data scientists can add value to an organization. By learning how to build and deploy scalable model pipelines, data scientists can own more of the model production process and more rapidly deliver data products.

9/13/2020

Decisive actions to emerge stronger in the next normal

Kevin Sneader, Shubham Singhal, and Bob Sternfels, What now? Decisive actions to emerge stronger in the next normal, McKinsey & Company, September 2020 (pdf)

  1. Think of the return as a muscle
  2. Focus on high-impact actions
  3. Rebuild for speed
  4. Reimagine the workforce from the top down
  5. Make bold portfolio moves
  6. Reset technology plans
  7. Rethink the global footprint
  8. Take the lead on climate and sustainability
  9. Think about the role of regulation and government
  10. Make purpose part of everything

9/11/2020

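How Keyarrow Taiwan (台灣引興) sticks to Toyota-style zero inventory

曾如瑩 and 管婺媛, As the king of machine tools faces a wave of supply-chain breaks, how does it stick to Toyota-style zero inventory?, Business Weekly (商業周刊), August 31, 2020

邱奕嘉 asks (hereafter Chiu): Every time a major natural disaster strikes, lean management is called into question, because companies with low inventory suffer supply-chain breaks and losses while those holding inventory see their sales grow. After this pandemic, do you think the zero-inventory concept should be adjusted?

王慶華 answers (hereafter Wang): Toyota has of course had its supply chain broken by natural disasters, but it is also the fastest in the world to recover. Companies care about long-term interests, not short-term ones. It is true that supply chains broke, and those holding stock may have fared better than others during the 3 months of the outbreak, but that only wins those 3 months. What comes after? Even if they earned twice the profit of others, it is only short term; inventory always runs out.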

9/10/2020

Linear Algebra and Optimization for Machine Learning: A Textbook

Charu C. Aggarwal, Linear Algebra and Optimization for Machine Learning: A Textbook, Springer, 1st ed, 2020.

This textbook introduces linear algebra and optimization in the context of machine learning. Examples and exercises are provided throughout the book. A solution manual for the exercises at the end of each chapter is available to teaching instructors. This textbook targets graduate-level students and professors in computer science, mathematics, and data science. Advanced undergraduate students can also use this textbook.

9/07/2020

The New Business of AI (and How It’s Different From Traditional Software)

Martin Casado and Matt Bornstein, The New Business of AI (and How It’s Different From Traditional Software), a16z.com, February 16, 2020.

We are huge believers in the power of AI to transform business: We’ve put our money behind that thesis, and we will continue to invest heavily in both applied AI companies and AI infrastructure. However, we have noticed in many cases that AI companies simply don’t have the same economic construction as software businesses. At times, they can even look more like traditional services companies. In particular, many AI companies have:

  1. Lower gross margins due to heavy cloud infrastructure usage and ongoing human support;
  2. Scaling challenges due to the thorny problem of edge cases;
  3. Weaker defensive moats due to the commoditization of AI models and challenges with data network effects....

Building, scaling, and defending great AI companies – practical advice for founders:

  • Eliminate model complexity as much as possible. 
  • Choose problem domains carefully – and often narrowly – to reduce data complexity. 
  • Plan for high variable costs.
  • Embrace services.
  • Plan for change in the tech stack. 
  • Build defensibility the old-fashioned way. 

9/01/2020

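Inventec uses 5G to let AI replace human visual inspection

王郁倫, Optical inspection 10 times smarter! Inside Inventec's server base: how does 5G let AI replace human visual inspection?, Business Next (數位時代), 2020.08.28

In mid-August 2020, Inventec set up a private enterprise 5G network and used its 100 Mbps upload and 1 Gbps download speeds to link the AOI (Automated Optical Inspection) systems on the production floor. Not only was the required staffing cut by 90%, the line's first pass yield (FPY) also rose to 85%. ...

The difference is that AOI has moved from "single-station" intelligence to "centralized" intelligence.

When AI learns to distinguish product defects, it must accumulate enough scanned images before it can judge accurately. Before 5G was introduced, the AOI on each production line ran its judgments on a nearby computer, and its image library came only from that line. After 5G was introduced, the AOI systems on 10 production lines all upload their images directly to the cloud for centralized judgment, and with sufficient data the accuracy naturally improves.

閻承隆, vice president of the Global Operations Center of Inventec's Enterprise Business Group, explains that because Wi-Fi was unstable, each production line in the smart factory used to build its own AI learning. Images from AOI scans were sent to the line's server, which computed whether each unit was good or bad. With too few images the accuracy was low, and it took continuous learning to raise it to 98%.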
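For readers unfamiliar with the FPY metric quoted in the article, here is a minimal sketch of how first pass yield is commonly computed: units that pass inspection the first time without rework, divided by units that entered the process. The function name and the unit counts are hypothetical. The article reports only the 85% figure, not the underlying counts.

```python
def first_pass_yield(units_started: int, units_passed_first_time: int) -> float:
    """Fraction of units that pass inspection on the first attempt, with no rework."""
    if units_started <= 0:
        raise ValueError("units_started must be positive")
    if not 0 <= units_passed_first_time <= units_started:
        raise ValueError("units_passed_first_time must be between 0 and units_started")
    return units_passed_first_time / units_started

# Hypothetical counts chosen only to reproduce an 85% FPY.
fpy = first_pass_yield(units_started=1000, units_passed_first_time=850)
print(f"FPY = {fpy:.0%}")
```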