Designing Machine Learning Systems_ An Iterative Process for Production-Ready Applications.pdf
- Document ID: 11592581
- Upload date: 2023-06-01
- Format: PDF
- Pages: 389
- Size: 15.49 MB
Chip Huyen
Designing Machine Learning Systems
An Iterative Process for Production-Ready Applications

MACHINE LEARNING

“This is, simply, the very best book you can read about how to build, deploy, and scale machine learning models at a company for maximum impact.”
Josh Wills, Software Engineer at WeaveGrid and former Director of Data Engineering, Slack

“In a blooming but chaotic ecosystem, this principled view on end-to-end ML is both your map and your compass: a must-read for practitioners inside and outside of Big Tech.”
Jacopo Tagliabue, Director of AI, Coveo

US $59.99 CAN $74.99
ISBN: 978-1-098-10796-3
Twitter:
Machine learning systems are both complex and unique. Complex because they consist of many different components and involve many different stakeholders. Unique because they're data dependent, with data varying wildly from one use case to the next. In this book, you'll learn a holistic approach to designing ML systems that are reliable, scalable, maintainable, and adaptive to changing environments and business requirements.

Author Chip Huyen, co-founder of Claypot AI, considers each design decision (such as how to process and create training data, which features to use, how often to retrain models, and what to monitor) in the context of how it can help your system as a whole achieve its objectives. The iterative framework in this book uses actual case studies backed by ample references.

This book will help you tackle scenarios such as:
- Engineering data and choosing the right metrics to solve a business problem
- Automating the process for continually developing, evaluating, deploying, and updating models
- Developing a monitoring system to quickly detect and address issues your models might encounter in production
- Architecting an ML platform that serves across use cases
- Developing responsible ML systems

Chip Huyen is co-founder of Claypot AI, a platform for real-time machine learning. Through her work at NVIDIA, Netflix, and Snorkel AI, she's helped some of the world's largest organizations develop and deploy ML systems. Chip based this book on her lectures for CS 329S: Machine Learning Systems Design, a course she teaches at Stanford University.

Praise for Designing Machine Learning Systems

There is so much information one needs to know to be an effective machine learning engineer. It's hard to cut through the chaff to get the most relevant information, but Chip has done that admirably with this book. If you are serious about ML in production, and care about how to design and implement ML systems end to end, this book is essential.
Laurence Moroney, AI and ML Lead, Google

One of the best resources that focuses on the first principles behind designing ML systems for production. A must-read to navigate the ephemeral landscape of tooling and platform options.
Goku Mohandas, Founder of Made With ML

Chip's manual is the book we deserve and the one we need right now. In a blooming but chaotic ecosystem, this principled view on end-to-end ML is both your map and your compass: a must-read for practitioners inside and outside of Big Tech, especially those working at “reasonable scale.” This book will also appeal to data leaders looking for best practices on how to deploy, manage, and monitor systems in the wild.
Jacopo Tagliabue, Director of AI, Coveo; Adj. Professor of MLSys, NYU

This is, simply, the very best book you can read about how to build, deploy, and scale machine learning models at a company for maximum impact. Chip is a masterful teacher, and the breadth and depth of her knowledge is unparalleled.
Josh Wills, Software Engineer at WeaveGrid and former Director of Data Engineering, Slack

This is the book I wish I had read when I started as an ML engineer.
Shreya Shankar, MLOps PhD Student

Designing Machine Learning Systems is a welcome addition to the field of applied machine learning. The book provides a detailed guide for people building end-to-end machine learning systems. Chip Huyen writes from her extensive, hands-on experience building real-world machine learning applications.
Brian Spiering, Data Science Instructor at Metis

Chip is truly a world-class expert on machine learning systems, as well as a brilliant writer. Both are evident in this book, which is a fantastic resource for anyone looking to learn about this topic.
Andrey Kurenkov, PhD Candidate at the Stanford AI Lab

Chip Huyen has produced an important addition to the canon of machine learning literature, one that is deeply literate in ML fundamentals, but has a much more concrete and practical approach than most. The focus on business requirements alone is uncommon and valuable. This book will resonate with engineers getting started with ML and with others in any part of the organization trying to understand how ML works.
Todd Underwood, Senior Engineering Director for ML SRE, Google, and Coauthor of Reliable Machine Learning

Chip Huyen
Designing Machine Learning Systems
An Iterative Process for Production-Ready Applications
Boston, Farnham, Sebastopol, Tokyo, Beijing

978-1-098-10796-3
LSI

Designing Machine Learning Systems
by Chip Huyen

Copyright © 2022 Huyen Thi Khanh Nguyen. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http:/). For more information, contact our corporate/institutional sales department: 800-998-9938 or .

Acquisitions Editor: Nicole Butterfield
Development Editor: Jill Leonard
Production Editor: Gregory Hyman
Copyeditor: nSight, Inc.
Proofreader: Piper Editorial Consulting, LLC
Indexer: nSight, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

May 2022: First Edition

Revision History for the First Edition
2022-05-17: First Release

See http:/ for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Designing Machine Learning Systems, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

The views expressed in this work are those of the author, and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

Table of Contents

Preface

1. Overview of Machine Learning Systems
  When to Use Machine Learning
  Machine Learning Use Cases
  Understanding Machine Learning Systems
  Machine Learning in Research Versus in Production
  Machine Learning Systems Versus Traditional Software
  Summary

2. Introduction to Machine Learning Systems Design
  Business and ML Objectives
  Requirements for ML Systems
  Reliability
  Scalability
  Maintainability
  Adaptability
  Iterative Process
  Framing ML Problems
  Types of ML Tasks
  Objective Functions
  Mind Versus Data
  Summary

3. Data Engineering Fundamentals
  Data Sources
  Data Formats
  JSON
  Row-Major Versus Column-Major Format
  Text Versus Binary Format
  Data Models
  Relational Model
  NoSQL
  Structured Versus Unstructured Data
  Data Storage Engines and Processing
  Transactional and Analytical Processing
  ETL: Extract, Transform, and Load
  Modes of Dataflow
  Data Passing Through Databases
  Data Passing Through Services
  Data Passing Through Real-Time Transport
  Batch Processing Versus Stream Processing
  Summary

4. Training Data
  Sampling
  Nonprobability Sampling
  Simple Random Sampling
  Stratified Sampling
  Weighted Sampling
  Reservoir Sampling
  Importance Sampling
  Labeling
  Hand Labels
  Natural Labels
  Handling the Lack of Labels
  Class Imbalance
  Challenges of Class Imbalance
  Handling Class Imbalance
  Data Augmentation
  Simple Label-Preserving Transformations
  Perturbation
  Data Synthesis
  Summary

5. Feature Engineering
  Learned Features Versus Engineered Features
  Common Feature Engineering Operations
  Handling Missing Values
  Scaling
  Discretization
  Encoding Categorical Features
  Feature Crossing
  Discrete and Continuous Positional Embeddings
  Data Leakage
  Common Causes for Data Leakage
  Detecting Data Leakage
  Engineering Good Features
  Feature Importance
  Feature Generalization
  Summary

6. Model Development and Offline Evaluation
  Model Development and Training
  Evaluating ML Models
  Ensembles
  Experiment Tracking and Versioning
  Distributed Training
  AutoML
  Model Offline Evaluation
  Baselines
  Evaluation Methods
  Summary

7. Model Deployment and Prediction Service
  Machine Learning Deployment Myths
  Myth 1: You Only Deploy One or Two ML Models at a Time
  Myth 2: If We Don't Do Anything, Model Performance Remains the Same
  Myth 3: You Won't Need to Update Your Models as Much
  Myth 4: Most ML Engineers Don't Need to Worry About Scale
  Batch Prediction Versus Online Prediction
  From Batch Prediction to Online Prediction
  Unifying Batch Pipeline and Streaming Pipeline
  Model Compression
  Low-Rank Factorization
  Knowledge Distillation
  Pruning
  Quantization
  ML on the Cloud and on the Edge
  Compiling and Optimizing Models for Edge Devices
  ML in Browsers
  Summary

8. Data Distribution Shifts and Monitoring
  Causes of ML System Failures
  Software System Failures
  ML-Specific Failures
  Data Distribution Shifts
  Types of Data Distribution Shifts
  General Data Distribution Shifts
  Detecting Data Distribution Shifts
  Addressing Data Distribution Shifts
  Monitoring and Observability

Link: https://www.bingdoc.com/p-11592581.html