Introduction to Attention Mechanism
Bo Wu, Apr. 28, 2018

Human visual attention
A particularly well-studied aspect is visual attention: many animals focus on specific parts of their visual inputs to compute adequate responses. Neural processes involving attention have been studied extensively in neuroscience and computational neuroscience [1, 2]. A similar idea, focusing on specific parts of the input, has been applied in deep learning to speech recognition, translation, image captioning, and question answering.

Attention in Deep Learning

Encoder-Decoder
The encoder encodes everything we need to know about the source sentence: it generates a single vector that fully captures the meaning of the source sentence. The decoder then generates a translation based solely on that vector.

Neural Machine Translation
Take a recurrent neural network (RNN), usually an LSTM, and encode a sentence written in language A (English). The RNN produces a hidden state, which we refer to as S; this hidden state hopefully represents all the content of the sentence. S is then supplied to the decoder, which generates the sentence in language B (German) word by word.
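A minimal sketch of this pipeline in PyTorch (the deck names no implementation; the `Seq2Seq` class and the layer sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encoder-decoder where the encoder's final hidden state S is the
    single fixed-length vector passed to the decoder."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the language-A sentence; `state` is the hidden state S.
        _, state = self.encoder(self.src_emb(src))
        # Decode the language-B sentence word by word, conditioned only on S.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)  # per-step logits over the target vocabulary
```

Note that `state` (the LSTM's hidden/cell pair) is the sole channel between the two networks; this is exactly the bottleneck the next slides point out.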
Image caption
(Figure: the same encoder-decoder pipeline applied to image captioning.)

Limitation
The limitation is that the only link between encoding and decoding is the fixed-length semantic vector C. This vector cannot fully represent the information of the entire sequence, and the longer the input sequence, the more serious this problem becomes.

Attention
Instead, we can extract a context vector that is a weighted summation of the encoder outputs, weighted by how relevant we think each of them is. The ingredients are the Value, the Key, and the Query.
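A sketch of this computation with dot-product scoring (the function name and shapes are illustrative; the deck only describes the weighted sum): the decoder state plays the Query, and the encoder outputs play both the Keys and the Values.

```python
import torch
import torch.nn.functional as F

def attention_context(query, enc_outputs):
    """Context vector as a weighted summation of the encoder outputs.

    query:       (hidden,)         current decoder state (Query)
    enc_outputs: (src_len, hidden) encoder outputs (Keys and Values)
    """
    scores = enc_outputs @ query        # relevance score per source position
    weights = F.softmax(scores, dim=0)  # normalized attention distribution
    return weights @ enc_outputs        # (hidden,) context vector
```

Because the weights are recomputed at every decoding step, the decoder can look at a different part of the source for each output word instead of relying on one fixed vector C.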
Attention is All You Need
This paper proposes a new, simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. It introduces a multi-head attention mechanism. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Advances in Neural Information Processing Systems, 2017: 6000-6010.

Multi-head attention
Multi-head attention allows the model to jointly attend to information from different representation subspaces at different positions. For large values of d_k, the dot products grow large in magnitude, pushing the softmax function into regions where it has extremely small gradients. To counteract this effect, the dot products are scaled by 1/sqrt(d_k).

Self Attention (intra-attention)

Conclusion
The essence of the attention mechanism is to pick out, from the source, the information that contributes most to the target. Self-attention can be seen as a special case of general attention: in self-attention Q = K = V, and every unit in the sequence attends over all the others.
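A sketch that puts the last two points together: scaled dot-product attention applied as self-attention, with Q, K, and V all derived from the same sequence (a single head for brevity; the random projection matrices are illustrative stand-ins for learned ones):

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the 1/sqrt(d_k) scaling keeps the
    softmax out of its small-gradient regions when d_k is large."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
    return F.softmax(scores, dim=-1) @ V

# Self-attention: Queries, Keys, and Values all come from one sequence x.
x = torch.randn(5, 64)                                   # 5 positions, d_model = 64
W_q, W_k, W_v = (torch.randn(64, 64) for _ in range(3))  # stand-in projections
out = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
```

Multi-head attention would run several copies of this in parallel on separately projected Q, K, and V and concatenate the results, which is what lets the model attend to different representation subspaces at once.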
References
Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems, 2014: 3104-3112.
Vinyals O, Toshev A, Bengio S, et al. Show and tell: A neural image caption generator. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015: 3156-3164.
Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Advances in Neural Information Processing Systems, 2017: 6000-6010.