Are you struggling to scale your large language models (LLMs) without breaking the bank or sacrificing latency? This book offers a clear roadmap to optimize inference, reduce costs, and scale seamlessly across frameworks and runtimes like PyTorch, ONNX Runtime, vLLM, and more.
Optimizing LLM Performance is your hands-on guide to boosting the efficiency of large language models in production environments. Whether you’re building chatbots, document summarizers, or enterprise AI tools, this book teaches proven methods to accelerate inference while maintaining accuracy. It dives deep into hardware-aware optimizations, quantization, model pruning, compiler acceleration, and memory-efficient runtime strategies without locking you into any single framework.
Written with clarity and real-world use in mind, the book features practical case studies, side-by-side performance comparisons, and up-to-date techniques from the cutting edge of AI deployment. If you're building, serving, or scaling LLMs in 2025, this is the performance engineering guide you've been waiting for.
Key Features:
• Framework-agnostic optimization techniques using PyTorch, ONNX Runtime, vLLM, llama.cpp, and more
• Deep dive into quantization (INT8/4-bit), distillation, pruning, and KV caching
• Hands-on examples with FastAPI, Hugging Face Transformers, and serverless deployment
• Covers performance profiling, streaming, batching, and cost-efficient scaling
• Future-proof insights on compiler-aware models, LoRA 2.0, and edge inference
Ready to build LLM systems that are faster, cheaper, and more scalable?
Grab your copy of Optimizing LLM Performance today and deploy smarter.
The information in the "Summary" section may refer to different editions of this title.
From: GreatBookPrices, Columbia, MD, U.S.A.
Condition: As New. Unread book in perfect condition. Item code 50955172
Quantity: More than 20 available
From: GreatBookPrices, Columbia, MD, U.S.A.
Condition: New. Item code 50955172-n
Quantity: More than 20 available
From: PBShop.store US, Wood Dale, IL, U.S.A.
PAP. Condition: New. New Book. Shipped from UK. Established seller since 2000. Item code L2-9798294338459
Quantity: More than 20 available
From: Grand Eagle Retail, Bensenville, IL, U.S.A.
Paperback. Condition: New. This item is printed on demand. Shipping may be from multiple locations in the US or from the UK, depending on stock availability. Item code 9798294338459
Quantity: 1 available
From: PBShop.store UK, Fairford, GLOS, United Kingdom
PAP. Condition: New. New Book. Shipped from UK. Established seller since 2000. Item code L2-9798294338459
Quantity: More than 20 available
From: GreatBookPricesUK, Woodford Green, United Kingdom
Condition: New. Item code 50955172-n
Quantity: More than 20 available
From: GreatBookPricesUK, Woodford Green, United Kingdom
Condition: As New. Unread book in perfect condition. Item code 50955172
Quantity: More than 20 available
From: CitiRetail, Stevenage, United Kingdom
Paperback. Condition: New. This item is printed on demand. Shipping may be from our UK warehouse or from our Australian or US warehouses, depending on stock availability. Item code 9798294338459
Quantity: 1 available