What makes GPT-3 and DALL·E powerful is exactly the same thing: data.

Data is crucial in our field, and our models are extremely data-hungry. These large models, whether language models like GPT-3 or image models like DALL·E, all require the same thing: way too much data.

The more data you have, the better. So you need to scale up those models, especially for real-world applications.

However, bigger models can only benefit from bigger datasets if the data is of high quality.

Feeding a model images that do not represent the real world is of no use and can even worsen its ability to generalize. This is where data-centric AI comes into play...

Learn more in the video:

 

References

►Read the full article: https://www.louisbouchard.ai/data-centric-ai/
►Data-centric AI: https://snorkel.ai/data-centric-ai
►Weak supervision: https://snorkel.ai/weak-supervision/
►Programmatic labeling: https://snorkel.ai/programmatic-labeling/
►Curated list of resources for Data-centric AI: https://github.com/hazyresearch/data-centric-ai
►Learn more about Snorkel: https://snorkel.ai/company/
►From Model-centric to Data-centric AI - Andrew Ng:


►Software 2.0: https://hazyresearch.stanford.edu/blog/2020-02-28-software2
►Paper 1: Ratner, A.J., De Sa, C.M., Wu, S., Selsam, D. and Ré, C., 2016. Data programming: Creating large training sets, quickly. Advances in Neural Information Processing Systems, 29.
►Paper 2: Ratner, A., Bach, S.H., Ehrenberg, H., Fries, J., Wu, S. and Ré, C., 2017. Snorkel: Rapid training data creation with weak supervision. Proceedings of the VLDB Endowment, 11(3), p. 269.
►Paper 3: Ré, C., 2018. Software 2.0 and Snorkel: Beyond Hand-Labeled Data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
►My Newsletter (a new AI application explained weekly, straight to your inbox!): https://www.louisbouchard.ai/newsletter/
