
Yin Changming (殷苌茗)

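Ph.D., Professor

University: Changsha University of Science and Technology

School: School of Computer and Communication Engineering

Personal homepage: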
https://www.csust.edu.cn/jtxy/info/1280/16601.htm

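Profile

Education:

Beijing Normal University

Bachelor's degree

1998

National University of Defense Technology

Master's degree

2006

Shanghai University

Ph.D.

Academic honors and influence: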

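1. 1998: "Outstanding Teacher", Changsha University of Electric Power

2. 1998: department-level "Outstanding Graduation Internship Advisor"

3. 2000: "Outstanding Teacher", Changsha University of Electric Power

4. 2000: "Quality Course Award", Changsha University of Electric Power

5. 2001: "Outstanding Teacher", Changsha University of Electric Power

6. 2002: "Outstanding Teacher", Changsha University of Electric Power

7. 2002: Third Prize, "Central China Electric Power Group Teaching Award Fund"

8. 2003: selected as a Young Backbone Teacher Training Candidate of Hunan Provincial Colleges and Universities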

Research Areas

Algorithms and computer software; machine learning and intelligent control


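Major projects completed or in progress:

1. Reinforcement learning for agents in partially observable Markov environments, National Natural Science Foundation of China, 2002-2005

2. Multi-time-scale risk-sensitive MDPs, University of Science and Technology research fund

3. Young Backbone Teacher Training Candidate of Hunan Province, Hunan Provincial Department of Education

4. Distributed data acquisition and fault diagnosis system for thermal power plants, Hunan Provincial Electric Power Bureau research project (1998), completed, 60,000 yuan, principal investigator.

5. Reinforcement learning for agents in partially observable Markov environments, National Natural Science Foundation of China project, in progress, 200,000 yuan, key researcher.

6. Regional power grid load forecasting and analysis system for Jiangxi Province, Jiangxi Electric Power Corporation, completed, 500,000 yuan, key researcher.

7. Development and promotion of teaching management software, Changsha University of Electric Power teaching research project (2000), completed, 5,000 yuan, key researcher.

8. Convergence of reinforcement learning algorithms, Hunan Provincial Education Commission research project (2000), completed, 5,000 yuan, key researcher.

9. Optimal control policies for reinforcement learning agents and their decision problems in micro-economic environments, Hunan Provincial Department of Education research fund project (2007), in progress, 10,000 yuan, principal investigator.

10. Multi-time-parameter risk-sensitive MDPs, Changsha University of Science and Technology research fund project (2006), in progress, 30,000 yuan, principal investigator.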

Recent Papers

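1. Optimal Equality for Multi-Time Scale Risk-Sensitive Markov Decision Processes, Proceedings in ISCST, 2005, Ningbo, China

2. Automatic Discovery of Subgoals for Sequential Decision Problems Using Potential Fields, Proceedings in ICNC, 2005: 384-391.

3. The Dynamic Merge Reinforcement Learning Algorithm for Solving POMDP (in Chinese), Computer Engineering, No. 19, 2005

4. Reinforcement Learning Forgetting Algorithm Based on Dynamic Programming (in Chinese), Computer Engineering and Applications, 2004, Vol. 40, No. 20

5. Reinforcement Learning Forgetting Algorithm Based on Dynamic Programming, Journal of Computer Engineering and Applications, 2004, Vol. 40, No. 20.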

6. Average Asymptotic Temporal Difference Learning Forgetting Algorithm on Eligibility Trace, Journal of Changsha University of Electric Power, 2003(4).

7. Reinforcement Learning Algorithm for Solving RTDP with Variational Environment. ICGST International Journal on Artificial Intelligence and Machine Learning (AIML), Volume (7), Issue (I), pp. 17-21.

8. Reinforcement Learning Algorithms Based on mGA and EA with Policy Iterations. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Bio-Inspired Computational Intelligence and Applications - International Conference on Life System Modeling and Simulation, LSMS 2007, Proceedings, v 4688 LNCS, 2007.

9. Risk-Sensitive Reinforcement Learning Algorithms with Generalized Average Criterion. Applied Mathematics and Mechanics - English Edition, 2007, V28, N3 (MAR), pp. 405-416.

10. Global Attractor for KGS Lattice System. Applied Mathematics and Mechanics - English Edition, 2007, V28, N5 (MAY), pp. 619-628.

11. Fused Sarsa(lambda) Learning Algorithm Based-on Multi-agent. Journal of Computer Engineering and Applications, 2008, 44(4), pp. 182-183.

12. Automatic Discovery of Subgoals for Sequential Decision Problems Using Potential Fields. 2005 International Conference on Natural Computation / 2005 International Conference on Fuzzy Systems and Knowledge Discovery (ICNC'05-FSKD'05), IEEE. 27-29 August 2005, Changsha, China. (Lecture Notes in Computer Science, v 3612, n PART III, Advances in Natural Computation: First International Conference, ICNC 2005. Proceedings, 2005, pp. 384-391)

13. Optimal Equality for Multi-Time Scale Risk-Sensitive Markov Decision Processes. Proceedings in the International Symposium on Computer Science and Technology 2005, Ningbo, China.

14. Reinforcement Learning Algorithm Based-on Policy Iteration for Solving RTDP. 2006.8, ISAI’2006, Beijing, China.

15. U-Clustering: A Reinforcement Learning Algorithm Based on Utility Clustering. Journal of Computer Engineering and Applications, 2005, No. 20.

16. Reinforcement Learning Forgetting Algorithm Based on Dynamic Programming. Journal of Computer Engineering and Applications, 2004, No. 20.

17. The Dynamic Merge Reinforcement Learning Algorithm for Solving POMDP. Journal of Computer Engineering. 2005, 11.

18. Multi-Time Scale Risk-Sensitive Hierarchical Structure Control Problem. DCABES 2006, Hangzhou, China, 2006.10.

19. Utility Clustering for Reinforcement Learning with Partial Observability. In Proceedings of Conference of Chinese Intelligence Automatization, Hong Kong, China, 2003. (IJCAI03).

20. Average Asymptotic Temporal Difference Learning Forgetting Algorithm on Eligibility Trace, Journal of Changsha University of Electric Power, 2003(4).

21. Nonlinear Control Based on Q-learning Algorithms. Journal of Changsha University of Electric Power, Vol. 18, No. 1, 2003(1).

22. A Relative Value Iteration Q-Learning Algorithm and Its Convergence Based-on Finite Samples. Journal of Computer Research and Development. Sept. 2002, Vol. 39, No. 9.

23. Optimality Cost Relative Value Iteration Q-Learning Algorithm Based on Finite Samples. Journal of Computer Engineering and Applications, 2002, No. 14.

24. Generalized Average Algorithm for Reinforcement Learning and Its Convergence. Journal of Computer Engineering and Applications, 2002, No. 20.

25. Reinforcement Learning Algorithm Based on Average Cost Optimization for Each Stage. Journal of Computer Applications, Vol. 22, No. 4, 2002(4).

26. Classification for Un-labeled Context Based on Maximum Expectation Learning Algorithm. Proceedings of 14th CDC (Annual Conference of Control and Decision, China).

27. A TD(lambda) Learning Forgetting Algorithm. Proceedings of 4th Machine and Electric Engineering Association of Hunan, China, Aug. 2002.

28. Distributed Real-time System for Electric Power Enterprise Based on Intranet/Web. Journal of Applications of the Computer Systems, 2002(4).

29. The Uniform of Security Policy in Distributed System. Journal of Information Engineering University, 2001. (Proceedings of Annual Conference of Chinese Networks and Information Security, Zhengzhou, China, 2001).

30. Design of Distributed Real Time Database System Based on JDBC/Web. Journal of Computer Development and Applications. 2001, No. 36.

31. The Application Delphi Multi-thread for Distributed Realtime Multi-task System. Journal of Changsha University of Electric Power, Vol. 15, No. 1, 2001(1).

32. Comparing ARP of IPv4 with Neighbor Discovery Protocol of IPv6. Journal of Changsha University of Electric Power, Vol. 16, No. 1, 2001(1).

33. Study and Application of Distributed Real Time Multimedia Database. Journal of Changsha University of Electric Power, Vol. 16, No. 2, 2001(2).

34. The Design of Real-time Monitor Database System Based on Distributed Heterogeneous Networks Environment. Journal of Changsha University of Electric Power, Vol. 16, No. 3, 2001(3).

35. Distributed Real-time Multi-task System Study and Application for Monitoring and Supervising in Electric Power Plant. Proceedings of 1st Machine and Electric Engineering Association of Hunan, China, Aug. 1999.

36. The Principles and Design Methods for Domain Service System of Campus Networks. Journal of Changsha University of Electric Power, Vol. 13, No. 1, 1998(1).

37. Security Study for Windows NT Network Management. Journal of Changsha University of Electric Power, Vol. 13, No. 2, 1998(2).

38. The Weighted Lorentz Norm Inequality of Generalization Maximum Operator. Annual of Hunan Mathematics, Vol. 17, No. 2, 1997.

39. The Weighted Boundary of Operator and Its Interpolation on Mixed Lebesgue Space. Journal of Changsha University of Electric Power, Vol. 12, No. 3, 1997(3).

40. The Alternativeness of Non-Commutative and Non-Combinative Fractional Ring. Journal of Changsha University of Water Resources and Electric Power, Vol. 8, No. 2, 1993(2).

41. The Combiner Theory of Non-Commutative and Non-Combinative Fractional Ring. Journal of Changsha University of Water Resources and Electric Power, Vol. 6, No. 2, 1991(2).

42. The Equivalence Conditions for Reductionable Elements on Complex Commutative Banach Algebra. Journal of Changsha University of Water Resources and Electric Power, Vol. 5, No. 1, 1990(1).

43. F-Set on Unit Square-Cube under n-Dimension Euclid Space. Journal of Changsha University of Water Resources and Electric Power, Vol. 5, No. 2, 1990(2).