Deep-Learning Big Data Analysis: Chinese-English Translation of a Foreign-Language Paper
Title: Prototyping a GPGPU Neural Network for Deep-Learning Big Data Analysis

Authors: Alcides Fonseca, Bruno Cabral

Journal: Big Data Research, Vol. 8, pp. 50-56

Year: 2017

Original Text
Abstract
Big Data concerns large-volume, complex, growing data. Given the fast development of data storage and networks, organizations are collecting large, ever-growing datasets that can contain useful information. In order to extract information from these datasets within useful time, it is important to use distributed and parallel algorithms. One common usage of big data is machine learning, in which collected data is used to predict future behavior. Deep Learning using Artificial Neural Networks is one of the popular methods for extracting information from complex datasets. Deep Learning is capable of creating more complex models than traditional probabilistic machine-learning techniques.

This work presents a step-by-step guide on how to prototype a Deep-Learning application that executes on both GPU and CPU clusters. Python and Redis are the core supporting tools of this guide. This tutorial will allow the reader to understand the basics of building a distributed, high-performance GPU application in a few hours. Since we do not depend on any deep-learning application or framework, and instead use low-level building blocks, this tutorial can be adapted for any other parallel algorithm the reader might want to prototype on Big Data. Finally, we will discuss how to move from a prototype to a full-blown production application.
Keywords: Big Data; Deep Learning; Prototyping; GPGPU; Cluster; Parallel programming
Introduction
Deep Learning refers to the usage of Artificial Neural Networks (ANN or NN) with several hidden layers, used for data with high dimensionality. A common example and benchmark for Deep Learning is image classification on the ImageNet dataset. ANNs can be used for classification tasks, with several applications in industry, business and science. Examples of applications include character recognition in scanned documents and predicting bankruptcy or health complications. Autonomous driving also makes heavy use of ANNs. An ANN begins with random weights, practically deciding everything at random. By training the ANN with several existing instances of the problem, one can evaluate the error produced. Weights are then adjusted, taking into account whether the network over- or under-estimated the final value.

In order to predict values, ANNs are built by connecting layers of neurons. ANNs use the first layer of neurons for the input features, and the final layer for the classification output. Fig. 1 shows an example of an ANN with four input neurons, four neurons in the hidden layer and two output neurons. All neurons in one layer are connected to all the neurons in the following layer.
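As an illustration of the fully connected layout above, a forward pass through the Fig. 1 network (four input, four hidden, two output neurons) can be sketched with NumPy. The random initial weights and the sigmoid activation are illustrative assumptions here; the paper's own listings define the exact functions used.

```python
import numpy as np

def sigmoid(x):
    # Standard logistic activation, squashing values into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w_hidden, w_output):
    """One forward pass: input layer -> hidden layer -> output layer."""
    hidden = sigmoid(x @ w_hidden)       # (4,) @ (4, 4) -> (4,)
    return sigmoid(hidden @ w_output)    # (4,) @ (4, 2) -> (2,)

rng = np.random.default_rng(0)
w_hidden = rng.standard_normal((4, 4))  # random initial weights
w_output = rng.standard_normal((4, 2))

x = np.array([0.5, 0.1, 0.9, 0.3])      # one example instance
out = forward(x, w_hidden, w_output)    # two activations in (0, 1)
```

Training then consists of comparing `out` against the expected output and nudging `w_hidden` and `w_output` to reduce the error.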
When the number of features increases (high dimensionality), the number of neurons in the hidden layers increases as well, in order to compensate for the possible interactions of input neurons. However, a rule of thumb is to use only one hidden layer with the same number of hidden neurons as there are input neurons. The second scalability issue with ANNs is that, for a high accuracy, they have to be trained with a large dataset. Typically, to achieve a good accuracy score, the number of instances should be three orders of magnitude higher than the number of features. Thus, we reach a point at which we need to train an ANN over several iterations, using a high number of features and instances. In these conditions, training an ANN becomes a computationally intensive operation, highly demanding in terms of processing, memory and disk usage. As the amount of data available for training grows beyond a terabyte, it becomes a Big Data problem.
The solution for Big Data processing is to distribute the computation across different machines, splitting data among them and merging results afterwards. In Map-Reduce approaches, it is possible to divide the computation into independent sub-problems that can be combined to produce a final result. Hadoop and Spark are the most used frameworks for Big Data processing.
ANNs are described by the following characteristics: the layout (the number of layers and the number of neurons in each layer) and the weights of the connections between neurons (the second attribute depends on the first). When training an ANN for a specific problem dataset, the weights are adjusted to minimize the output error. Because the predictions of ANNs can be described as matrix operations (we multiply the same weights by each row of features of the problem instances), graphical processing units (GPUs) are usually a good solution for improving performance and reducing training times. GPUs were designed to perform matrix operations in the context of video processing, but have been widely used for other ends. This technique is commonly referred to as General-Purpose GPU programming (GPGPU).

Problem and dataset
Before starting to implement the ANN, it is necessary to define the problem dataset to use for development and testing purposes. It would be impractical to use a large dataset in development, since it would impose an unbearable overhead on each test iteration. Instead, a small subset of a larger dataset should be used for prototyping, and the remaining data should be used for evaluating the complete solution. As such, we are going to use the Wine Data Set, which has been utilized for evaluating classifiers faced with high dimensionality (a high number of features). The dataset has 178 instances, a reasonable number for testing our prototype. Each instance has 13 features (either real or integer values) and three output values. Since each instance can belong to one of three classes, the output value is set to one if the instance belongs to the corresponding class. In this tutorial, we will show how to build an ANN trained to classify wine when faced with an unknown instance. The ANN will have 13 input neurons and three output neurons, matching the dataset format. There is one hidden layer, also with 13 neurons, following the rule of thumb above. Bear in mind that this is only an example configuration; many others are acceptable and may even yield better results.
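Preparing the dataset in this shape (feature columns rescaled into [0, 1], class labels expanded into one-hot rows of three output values) might be sketched as follows. The tiny feature matrix below is a stand-in for the real 178x13 Wine data, whose loading from CSV is omitted here.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Encode integer class labels (1..num_classes) as one-hot rows."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels - 1] = 1.0
    return out

def min_max_normalize(features):
    """Rescale every feature column into [0, 1], as the tutorial suggests."""
    lo = features.min(axis=0)
    hi = features.max(axis=0)
    return (features - lo) / (hi - lo)

# Toy stand-in for the 178x13 Wine feature matrix and its class column.
features = np.array([[12.8, 1.5], [13.7, 2.1], [11.0, 3.3]])
labels = np.array([1, 2, 3])

X = min_max_normalize(features)   # all values now within [0, 1]
Y = one_hot(labels, 3)            # e.g. class 2 -> [0, 1, 0]
```

Each row of `Y` has exactly one value set to one, matching the three output neurons of the network.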
In order to adapt more easily to different problems, ANNs use a non-linear function at their core. In our prototype we will use the sigmoid function, the most common function for this use. In order to improve the learning performance, we normalize our dataset to values between 0 and 1.

A distributed neural network using redis

In the previous section we wrote a function to train an ANN based on a given dataset. For Big Data, things are more complex. To handle large amounts of data with high dimensionality, it is useful to distribute storage and computation across machines. Map-Reduce is a very common paradigm for parallelizing algorithms across different machines. The problem is divided into several sub-problems, each concerning a subset of the data. Each sub-problem is solved on a different machine, and the results are combined to produce the solution for the overall problem.
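For the ANN, one way to combine the sub-results is to average each worker's weight matrices element-wise. A minimal sketch, with the worker results simulated locally (in the real prototype they arrive over the network), and with the 13-13-3 layout of our Wine network assumed for the shapes:

```python
import numpy as np

def merge_weights(worker_results):
    """Average each weight matrix element-wise across all workers."""
    return [np.mean(matrices, axis=0) for matrices in zip(*worker_results)]

# Simulated results: each worker trained on its own data subset and
# produced its own (hidden, output) weight matrices of identical shape.
rng = np.random.default_rng(1)
worker_a = [rng.standard_normal((13, 13)), rng.standard_normal((13, 3))]
worker_b = [rng.standard_normal((13, 13)), rng.standard_normal((13, 3))]

merged = merge_weights([worker_a, worker_b])
# merged[0] is the element-wise mean of the two hidden-layer matrices.
```

Averaging works because all workers share the same network layout, so the matrices being merged always have matching shapes.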
Parallelizing an ANN can be done by subdividing the whole dataset into several training sets. Each machine trains the neural network on its own dataset, obtaining different weight matrices per machine. Then, the matrices are combined by averaging each value across matrices. The example in Fig. 2 shows a master and two worker processes. The master process manages execution, creates the requests sent to the workers, waits for responses and merges results. Training is asynchronous and occurs in parallel, and it consumes the larger part of the computation time because it has to be repeated a high number of times. Requests and responses have to be exchanged over the network, in order to allow distinct machines to collaborate in the computational work. To this end, we used the redis database as a message broker. Several other message-queue systems could have been used, but we decided to work with redis because of its flexibility: no configuration is necessary and the interface is very straightforward. Other alternatives will be discussed in Section 6. We only need a single redis instance, running on the master node, to enqueue requests and responses. To keep things as simple as possible, requests and responses hold the same data:
- The number of iterations to be performed (can be used to control the task duration)
- The bounds of the data subset to use
- The configuration of the ANN
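These fields must be serialized to strings before redis can store them. As a hedged sketch of one possible scheme (not the paper's Listing 1.6, which is not reproduced in this excerpt): flatten the NumPy array, carry its raw bytes, and send shape and dtype separately as metadata.

```python
import json
import numpy as np

def encode(array):
    """Serialize a NumPy array to a string that redis can store."""
    meta = {
        "shape": array.shape,
        "dtype": str(array.dtype),
        # Reduce to one dimension and hex-encode the raw bytes.
        "data": array.ravel().tobytes().hex(),
    }
    return json.dumps(meta)

def decode(message):
    """Rebuild the original array from the string and its metadata."""
    meta = json.loads(message)
    flat = np.frombuffer(bytes.fromhex(meta["data"]), dtype=meta["dtype"])
    return flat.reshape(meta["shape"])

weights = np.arange(6, dtype=np.float64).reshape(2, 3)
restored = decode(encode(weights))  # round-trips exactly
```

In the prototype, the master would push such strings onto a redis request queue (e.g. with `LPUSH`) and workers would pop them (e.g. with `BRPOP`); responses travel back the same way.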
In requests, the ANN configuration is the initial master configuration. In responses, the configuration is the result of the worker training. Since redis only accepts strings as values, it is necessary to encode and decode information. Listing 1.6 shows an example of a possible encoding. Encoding NumPy arrays is not trivial: arrays have to be reduced to one dimension and converted to a string. The dimensions and data type of the array are communicated separately as metadata.

Discussion and homework
Just like any prototype, this is not a finished product. As such, there are several shortcomings and significant room for improvement. This section addresses issues with our work and proposes alternatives that the reader may opt to pursue.
Python is not the best language in terms of performance, given its overhead in interpreting code, expensive dynamic data structures (such as lists) and the Global Interpreter Lock (GIL). While this is true, in our implementation most of the computation does not execute in Python: the GPU version compiles Python to LLVM, which is compiled to the Nvidia PTX format, similarly to how CUDA works; and the CPU version executes the matrix multiplications in C, with C structures, thanks to the NumPy library. But it is possible to reduce even further the overhead of Python code.
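The point about NumPy executing the heavy lifting in C can be made concrete: the vectorized expression below performs the same multiplication as the pure-Python triple loop, but without per-element interpreter overhead. The matrix sizes are illustrative assumptions.

```python
import numpy as np

def matmul_python(a, b):
    """Naive triple loop: every operation pays interpreter overhead."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]
    return out

rng = np.random.default_rng(2)
a = rng.standard_normal((60, 60))
b = rng.standard_normal((60, 60))

slow = matmul_python(a.tolist(), b.tolist())
fast = a @ b  # dispatched to compiled C/BLAS code via NumPy

# Both compute the same product; the NumPy call is typically orders
# of magnitude faster, since the loop never runs in the interpreter.
```

Timing the two versions (e.g. with `time.perf_counter`) makes the gap visible even at this small size.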