Large Vision-language Models: Pre-training, Prompting and Applications - Hardcover

9783031949685: Large Vision-language Models: Pre-training, Prompting and Applications

Hardcover

ISBN 10: 3031949684 ISBN 13: 9783031949685

Publisher: Springer-Nature New York Inc, 2025

3 Used

From: EUR 92,50

14 New

From: EUR 150,28

Rapid progress in large multimodal foundation models, especially vision-language models, has dramatically transformed the landscape of machine learning, computer vision, and natural language processing. These powerful models, trained on vast amounts of multimodal data combining images and text, have demonstrated remarkable capabilities in tasks ranging from image classification and object detection to visual content generation and question answering. This book provides a comprehensive and up-to-date exploration of large vision-language models, covering the key aspects of their pre-training, prompting techniques, and diverse real-world computer vision applications. It is an essential resource for researchers, practitioners, and students in the fields of computer vision, natural language processing, and artificial intelligence.

Large Vision-Language Models begins by exploring the fundamentals of large vision-language models, covering architectural designs, training techniques, and dataset construction methods. It then examines prompting strategies and other adaptation methods, demonstrating how these models can be effectively fine-tuned to address a wide range of downstream tasks. The final section focuses on the application of vision-language models across various domains, including open-vocabulary object detection, 3D point cloud processing, and text-driven visual content generation and manipulation.

Beyond the technical foundations, the book explores the wide-ranging applications of vision-language models (VLMs), from enhancing image recognition systems to enabling sophisticated visual content generation and facilitating more natural human-machine interactions. It also addresses key challenges in the field, such as feature alignment, scalability, data requirements, and evaluation metrics. By providing a comprehensive roadmap for both newcomers and experts, this book serves as a valuable resource for understanding the current landscape, limitations, and future directions of VLMs, ultimately contributing to the advancement of artificial intelligence.

The information in the "Summary" section may refer to different editions of this title.

About the author

Kaiyang Zhou is an Assistant Professor at the Department of Computer Science, Hong Kong Baptist University, working on computer vision and machine learning. He has published more than 30 technical papers in top-tier journals and conferences in relevant fields, including CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, AAAI, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), and International Journal of Computer Vision (IJCV), with over 10,000 citations in total. He is an Associate Editor of IJCV, the flagship journal in computer vision, and regularly serves as an area chair and senior program committee member for top-tier computer vision and machine learning conferences, such as NeurIPS, CVPR, ECCV, and AAAI.

Ziwei Liu is an Associate Professor at Nanyang Technological University, Singapore. His research interests include computer vision, machine learning, and computer graphics. He has published extensively in top-tier conferences and journals in relevant fields, including CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, IEEE Transactions on Pattern Analysis and Machine Intelligence, ACM Transactions on Graphics, and Nature Machine Intelligence. He is the recipient of the ICCV Young Researcher Award, HKSTP Best Paper Award, CVPR Best Paper Award Candidate, ICBS Frontiers of Science Award, and MIT Technology Review Innovators under 35 Asia Pacific. He serves as an area chair of CVPR, ICCV, ECCV, NeurIPS, and ICLR, as well as an associate editor of International Journal of Computer Vision.

Peng Gao is a research scientist at Shanghai Artificial Intelligence Laboratory, working on large language models and vision-language models. His research interests include vision-language models, large language models, and diffusion models for content creation. He has published more than 40 papers in top-tier journals and conferences, including International Journal of Computer Vision (IJCV), ICML, ICLR, NeurIPS, CVPR, ICCV, and ECCV, receiving more than 10,000 citations. He has led several influential open-source projects, including LLaMA-Adapter and the Lumina series, which have received more than 7,000 and 2,000 stars, respectively.

From the back cover

The information in the "About this book" section may refer to different editions of this title.

Publisher: Springer-Nature New York Inc
Publication date: 2025
Language: English
ISBN 10: 3031949684
ISBN 13: 9783031949685
Binding: Hardcover
Number of pages: 420
Editors: Zhou Kaiyang, Liu Ziwei, Gao Peng
Manufacturer contact: ProductSafety@springernature.com

Poststr. 9
Darmstadt
64293
Germany

Buy used

Condition: Like new

Condition description: minor shelfwear...

EUR 92,50

Shipping: EUR 50,00
Ships from Germany to U.S.A.

Buy new

EUR 150,28

Shipping: EUR 5,50
Ships from Italy to U.S.A.

Search results for Large Vision-language Models: Pre-training, Prompting...

Large Vision-Language Models

Publisher: Springer, 2026

ISBN 10: 3031949684 ISBN 13: 9783031949685

Used Hardcover

From: SKULIMA Wiss. Versandbuchhandlung, Westhofen, Germany

Seller rating: 5 out of 5 stars

Condition: Like new. Condition description: minor shelfwear. Pre-training, Prompting, and Applications. Edited by Kaiyang Zhou, Ziwei Liu and Peng Gao. XVII, 429 pages with numerous color illustrations and some tables, hardcover (Advances in Computer Vision and Pattern Recognition/Springer Verlag 2026 [sic! recte: 2025]). Reduced from EUR 192,59. Weight: 839 g. Item code 130950

Buy used

EUR 92,50

Shipping: EUR 50,00
Ships from Germany to U.S.A.

Quantity: 1 available

Large Vision-Language Models (eng)

Zhou, Kaiyang

Publisher: Springer, 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

New Hardcover

Print on Demand

From: Brook Bookstore On Demand, Napoli, NA, Italy

Seller rating: 5 out of 5 stars

Condition: New. This is a print-on-demand item. Item code XAQ77XPGKF

Buy new

EUR 150,28

Shipping: EUR 5,50
Ships from Italy to U.S.A.

Quantity: More than 20 available

Large Vision-Language Models

Publisher: Springer Verlag GmbH, 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

New Hardcover

Print on Demand

From: moluna, Greven, Germany

Seller rating: 5 out of 5 stars

Hardcover. Condition: New. This is a print-on-demand item and will be printed for you after you place your order. Item code 2381757551

Buy new

EUR 158,41

Shipping: EUR 48,99
Ships from Germany to U.S.A.

Quantity: More than 20 available

Large Vision-Language Models : Pre-Training, Prompting and Applications

Zhou, Kaiyang (EDT); Liu, Ziwei (EDT); Gao, Peng (EDT)

Publisher: Springer, 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

New Hardcover

From: GreatBookPricesUK, Woodford Green, United Kingdom

Seller rating: 5 out of 5 stars

Condition: New. Item code 51229439-n

Buy new

EUR 193,70

Shipping: EUR 17,51
Ships from United Kingdom to U.S.A.

Quantity: More than 20 available

Large Vision-Language Models

Kaiyang Zhou

Publisher: Springer, Aug 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

New Hardcover

Print on Demand

From: BuchWeltWeit Ludwig Meier e.K., Bergisch Gladbach, Germany

Seller rating: 5 out of 5 stars

Book. Condition: New. This item is printed on demand; it takes 3-4 days longer. New stock. 448 pp. English. Item code 9783031949685

Buy new

EUR 192,59

Shipping: EUR 23,00
Ships from Germany to U.S.A.

Quantity: 2 available

Large Vision-Language Models : Pre-Training, Prompting and Applications

Zhou, Kaiyang (EDT); Liu, Ziwei (EDT); Gao, Peng (EDT)

Publisher: Springer, 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

New Hardcover

From: GreatBookPrices, Columbia, MD, U.S.A.

Seller rating: 5 out of 5 stars

Condition: New. Item code 51229439-n

Buy new

EUR 216,14

Shipping: EUR 2,31
Ships within U.S.A.

Quantity: More than 20 available

Large Vision-Language Models : Pre-Training, Prompting and Applications

Zhou, Kaiyang (EDT); Liu, Ziwei (EDT); Gao, Peng (EDT)

Publisher: Springer, 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

Used Hardcover

From: GreatBookPrices, Columbia, MD, U.S.A.

Seller rating: 5 out of 5 stars

Condition: As New. Unread book in perfect condition. Item code 51229439

Buy used

EUR 226,37

Shipping: EUR 2,31
Ships within U.S.A.

Quantity: More than 20 available

Large Vision-Language Models : Pre-Training, Prompting and Applications

Zhou, Kaiyang (EDT); Liu, Ziwei (EDT); Gao, Peng (EDT)

Publisher: Springer, 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

Used Hardcover

From: GreatBookPricesUK, Woodford Green, United Kingdom

Seller rating: 5 out of 5 stars

Condition: As New. Unread book in perfect condition. Item code 51229439

Buy used

EUR 226,28

Shipping: EUR 17,51
Ships from United Kingdom to U.S.A.

Quantity: More than 20 available

Large Vision-Language Models

Zhou, Kaiyang

Publisher: Springer International Publishing AG, CH, 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

New Hardcover

From: Rarewaves.com USA, London, LONDO, United Kingdom

Seller rating: 5 out of 5 stars

Hardback. Condition: New. Item code LU-9783031949685

Buy new

EUR 252,02

Free shipping
Ships from United Kingdom to U.S.A.

Quantity: More than 20 available

Large Vision-Language Models

Kaiyang Zhou

Publisher: Springer, Aug 2025

ISBN 10: 3031949684 ISBN 13: 9783031949685

New Hardcover

Print on Demand

From: buchversandmimpf2000, Emtmannsberg, BAYE, Germany

Seller rating: 5 out of 5 stars

Book. Condition: New. This item is printed on demand (print-on-demand title). New stock.

Part 1: Pre-training and Datasets
Chapter 1: LAION-5B: A Massive Open Image-Text Dataset
Chapter 2: Efficient Training of Large-Scale Vision-Language Models
Chapter 3: Scaling Laws for Contrastive Language-Image Learning
Chapter 4: Scaling Up Vision-Language Models for Generic Tasks
Chapter 5: Searching for Next-Gen Multimodal Datasets

Part 2: Prompting and Generalization
Chapter 6: Soft Prompt Learning for Vision-Language Models
Chapter 7: Unified Prompting for Vision and Language
Chapter 8: Zero-Shot Image Classification with Custom Prompts
Chapter 9: Enhancing Vision-Language Models with Feature Adapters
Chapter 10: Automatic Optimization of Prompting Architectures
Chapter 11: Open-Vocabulary Calibration for VL Models

Part 3: Applications
Chapter 12: Open-Vocabulary DETR with Conditional Matching
Chapter 13: Extracting Dense Labels from CLIP
Chapter 14: PointCLIP: Understanding Point Clouds with VL
Chapter 15: Diffusion-Based Relation Inversion from Images
Chapter 16: Text-to-Video Generation
Chapter 17: Text-Driven Human Motion Generation
Chapter 18: Zero-Shot Text-Driven 3D Avatar Generation
Chapter 19: Zero-Shot Text-Driven HDR Panorama Generation

Springer-Verlag KG, Sachsenplatz 4-6, 1201 Wien. 448 pp. English. Item code 9783031949685

Buy new

EUR 192,59

Shipping: EUR 60,00
Ships from Germany to U.S.A.

Quantity: 1 available

