Changes

1 byte added, 12:10, 9 January 2022 (Sunday)
Line 428: Line 428:
=== Compound hierarchical-deep models ===
 
Compound hierarchical-deep models compose deep networks with non-parametric [https://en.wikipedia.org/wiki/Bayesian_network Bayesian models]. [https://en.wikipedia.org/wiki/Feature_(machine_learning) Features] can be learned using deep architectures such as DBNs,<ref name="hinton2006" /> DBMs,<ref name="ref3">{{cite journal|last1=Hinton|first1=Geoffrey|last2=Salakhutdinov|first2=Ruslan|date=2009|title=Efficient Learning of Deep Boltzmann Machines|url=http://machinelearning.wustl.edu/mlpapers/paper_files/AISTATS09_SalakhutdinovH.pdf|volume=3|pages=448–455}}</ref> deep autoencoders,<ref name="ref15">{{cite journal|last2=Bengio|first2=Yoshua|last3=Louradour|first3=Jérôme|last4=Lamblin|first4=Pascal|date=2009|title=Exploring Strategies for Training Deep Neural Networks|url=http://dl.acm.org/citation.cfm?id=1577070|journal=The Journal of Machine Learning Research|volume=10|pages=1–40|last1=Larochelle|first1=Hugo}}</ref> convolutional variants,<ref name="ref39">{{cite journal|last2=Carpenter|first2=Blake|date=2011|title=Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning|url=http://www.iapr-tc11.org/archive/icdar2011/fileup/PDF/4520a440.pdf|journal=|volume=|pages=440–445|via=|last1=Coates|first1=Adam}}</ref><ref name="ref40">{{cite journal|last2=Grosse|first2=Roger|date=2009|title=Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations|url=http://portal.acm.org/citation.cfm?|journal=Proceedings of the 26th Annual International Conference on Machine Learning|pages=1–8|last1=Lee|first1=Honglak}}</ref> ssRBMs,<ref name="ref32" /> deep coding networks,<ref name="ref41">{{cite journal|last2=Zhang|first2=Tong|date=2010|title=Deep Coding Network|url=http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2010_1077.pdf|journal=Advances in Neural . . .|pages=1–9|last1=Lin|first1=Yuanqing}}</ref> DBNs with sparse feature learning,<ref name="ref42">{{cite journal|last2=Boureau|first2=Y-Lan|date=2007|title=Sparse Feature Learning for Deep Belief Networks|url=http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2007_1118.pdf|journal=Advances in Neural Information Processing Systems|volume=23|pages=1–8|last1=Ranzato|first1=Marc Aurelio}}</ref> RNNs,<ref name="ref43">{{cite journal|last2=Lin|first2=Clif|date=2011|title=Parsing Natural Scenes and Natural Language with Recursive Neural Networks|url=http://machinelearning.wustl.edu/mlpapers/paper_files/ICML2011Socher_125.pdf|journal=Proceedings of the 26th International Conference on Machine Learning|last1=Socher|first1=Richard}}</ref> conditional DBNs,<ref name="ref44">{{cite journal|last2=Hinton|first2=Geoffrey|date=2006|title=Modeling Human Motion Using Binary Latent Variables|url=http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2006_693.pdf|journal=Advances in Neural Information Processing Systems|last1=Taylor|first1=Graham}}</ref> and denoising autoencoders.<ref name="ref45">{{cite journal|last2=Larochelle|first2=Hugo|date=2008|title=Extracting and composing robust features with denoising autoencoders|url=http://portal.acm.org/citation.cfm?|journal=Proceedings of the 25th international conference on Machine learning – ICML '08|pages=1096–1103|last1=Vincent|first1=Pascal}}</ref> This provides a better representation, allowing faster learning and more accurate classification with high-dimensional data. However, these architectures are poor at learning novel classes from few examples, because all network units are involved in representing the input (a distributed representation) and must be adjusted together (high [https://en.wikipedia.org/wiki/Degree_of_freedom degree of freedom]). Limiting the degree of freedom reduces the number of parameters to learn, which makes it easier to learn new classes from few examples. [https://en.wikipedia.org/wiki/Hierarchical_Bayesian_model Hierarchical Bayesian models] allow learning from few examples, for example<ref name="ref34">{{cite journal|last2=Perfors|first2=Amy|last3=Tenenbaum|first3=Joshua|date=2007|title=Learning overhypotheses with hierarchical Bayesian models|journal=Developmental Science|volume=10|issue=3|pages=307–21|last1=Kemp|first1=Charles}}</ref><ref name="ref37">{{cite journal|last2=Tenenbaum|first2=Joshua|date=2007|title=Word learning as Bayesian inference|journal=Psychol. Rev.|volume=114|issue=2|pages=245–72|last1=Xu|first1=Fei}}</ref><ref name="ref46">{{cite journal|last2=Polatkan|first2=Gungor|date=2011|title=The Hierarchical Beta Process for Convolutional Factor Analysis and Deep Learning|url=http://machinelearning.wustl.edu/mlpapers/paper_files/ICML2011Chen_251.pdf|journal=Machine Learning . . .|last1=Chen|first1=Bo}}</ref><ref name="ref47">{{cite journal|last2=Fergus|first2=Rob|date=2006|title=One-shot learning of object categories|journal=IEEE Transactions on Pattern Analysis and Machine Intelligence|volume=28|issue=4|pages=594–611|last1=Fei-Fei|first1=Li}}</ref><ref name="ref48">{{cite journal|last2=Dunson|first2=David|date=2008|title=The Nested Dirichlet Process|url=http://amstat.tandfonline.com/doi/full/10.1198/016214508000000553|journal=Journal of the American Statistical Association|volume=103|issue=483|pages=1131–1154|last1=Rodriguez|first1=Abel}}</ref> in computer vision, [https://en.wikipedia.org/wiki/Statistics statistics] and cognitive science.
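
The point about learning from few examples can be illustrated with a minimal sketch of a two-level hierarchical Bayesian estimate. It is not taken from the cited works: the normal-normal model, the function names <code>fit_prior</code> and <code>posterior_mean</code>, and all numbers are assumptions chosen for illustration. A prior over class means is fitted on familiar classes, and a novel class seen only once is then estimated by combining that prior with its single example.

```python
from statistics import fmean, pvariance


def fit_prior(class_means):
    """Estimate a shared Gaussian prior (mean, variance) over class means.

    Hypothetical helper: the prior is fitted by simple moment matching
    on the means of already familiar classes.
    """
    return fmean(class_means), pvariance(class_means)


def posterior_mean(examples, prior_mean, prior_var, noise_var):
    """Posterior mean of a novel class mean under a normal-normal model.

    The result is a precision-weighted average of the prior mean and
    the observed examples, so a single example is enough for an estimate.
    """
    n = len(examples)
    precision = 1.0 / prior_var + n / noise_var
    return (prior_mean / prior_var + sum(examples) / noise_var) / precision


# Illustrative data (assumed): three familiar classes and one novel example.
familiar_means = [1.0, 2.0, 3.0]
prior_mean, prior_var = fit_prior(familiar_means)

novel_examples = [5.0]
estimate = posterior_mean(novel_examples, prior_mean, prior_var, noise_var=1.0)
print(estimate)
```

With these assumed numbers the estimate is 3.2: the single observation 5.0 is pulled toward the prior mean 2.0. Only one quantity is fitted for the novel class, which is the sense in which a shared prior limits the degrees of freedom.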
 
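Compound HD architectures aim to integrate the characteristics of hierarchical Bayesian (HB) models and deep networks. The compound HDP-DBM architecture uses a [https://en.wikipedia.org/wiki/Hierarchical_Dirichlet_process hierarchical Dirichlet process] (HDP) as the hierarchical model and incorporates a DBM architecture. It is a full [https://en.wikipedia.org/wiki/Generative_model generative model], generalized from abstract concepts flowing through the model layers, and it can synthesize new examples in novel classes that look "reasonably" natural. All levels are learned jointly by maximizing a joint [https://en.wikipedia.org/wiki/Log_probability log-probability] [https://en.wikipedia.org/wiki/Score_(statistics) score].<ref name="ref38">{{cite journal|last2=Tenenbaum|first2=Joshua|date=2012|title=Learning with Hierarchical-Deep Models|journal=IEEE Transactions on Pattern Analysis and Machine Intelligence|volume=35|issue=8|pages=1958–71|last1=Salakhutdinov|first1=Ruslan}}</ref>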
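
The idea of a joint log-probability score over several levels can be sketched with a toy two-level model. This is an assumption-laden illustration, not the HDP-DBM itself: a Chinese restaurant process (the sequential view of a single Dirichlet process) stands in for the hierarchical prior, a unit-variance Gaussian stands in for the lower-level model, and the names <code>crp_log_prob</code>, <code>gaussian_log_pdf</code> and <code>joint_log_score</code> are hypothetical.

```python
import math


def crp_log_prob(assignments, alpha):
    """Log-probability of a sequence of group assignments under a
    Chinese restaurant process with concentration parameter alpha."""
    counts = {}
    log_p = 0.0
    for i, group in enumerate(assignments):
        if group in counts:
            # Join an existing group with probability count / (i + alpha).
            log_p += math.log(counts[group] / (i + alpha))
        else:
            # Open a new group with probability alpha / (i + alpha).
            log_p += math.log(alpha / (i + alpha))
        counts[group] = counts.get(group, 0) + 1
    return log_p


def gaussian_log_pdf(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def joint_log_score(features, assignments, group_means, alpha=1.0, var=1.0):
    """Joint score: log p(assignments) + sum of log p(feature | group)."""
    prior = crp_log_prob(assignments, alpha)
    likelihood = sum(
        gaussian_log_pdf(x, group_means[g], var)
        for x, g in zip(features, assignments)
    )
    return prior + likelihood


# Illustrative data (assumed): two features near 0 and one near 5.
features = [0.1, -0.2, 5.0]
group_means = {0: 0.0, 1: 5.0}

split = joint_log_score(features, [0, 0, 1], group_means)
merged = joint_log_score(features, [0, 0, 0], group_means)
print(split > merged)
```

Because the prior term and the likelihood term are added into one score, a grouping that the prior slightly disfavors (opening a second group) can still win when it explains the data much better, which is the trade-off that joint maximization resolves.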
  