Martha White
Person information
- affiliation: University of Alberta, Edmonton, Canada
2020 – today
- 2024
- [j18] Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White:
Investigating the properties of neural network representations in reinforcement learning. Artif. Intell. 330: 104100 (2024)
- [j17] Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin J. Talvitie, Michael Bowling, Martha White:
Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models. J. Artif. Intell. Res. 80: 441-473 (2024)
- [j16] Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White:
GVFs in the real world: making predictions online for water treatment. Mach. Learn. 113(8): 5151-5181 (2024)
- [j15] Lingwei Zhu, Matthew Schlegel, Han Wang, Martha White:
Offline Reinforcement Learning via Tsallis Regularization. Trans. Mach. Learn. Res. 2024 (2024)
- [c63] Vincent Liu, James R. Wright, Martha White:
Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning (Abstract Reprint). AAAI 2024: 22706
- [c62] Brett Daley, Martha White, Marlos C. Machado:
Averaging n-step Returns Reduces Variance in Reinforcement Learning. ICML 2024
- [c61] Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas:
Position: Benchmarking is Limited in Reinforcement Learning Research. ICML 2024
- [i84] Brett Daley, Martha White, Marlos C. Machado:
Compound Returns Reduce Variance in Reinforcement Learning. CoRR abs/2402.03903 (2024)
- [i83] Hugo Silva, Martha White:
What to Do When Your Discrete Optimization Is the Size of a Neural Network? CoRR abs/2402.10339 (2024)
- [i82] Ehsan Imani, Kai Luedemann, Sam Scholnick-Hughes, Esraa Elelimy, Martha White:
Investigating the Histogram Loss in Regression. CoRR abs/2402.13425 (2024)
- [i81] Golnaz Mesbahi, Olya Mastikhina, Parham Mohammad Panahi, Martha White, Adam White:
Tuning for the Unknown: Revisiting Evaluation Strategies for Lifelong RL. CoRR abs/2404.02113 (2024)
- [i80] Kevin Roice, Parham Mohammad Panahi, Scott M. Jordan, Adam White, Martha White:
A New View on Planning in Online Reinforcement Learning. CoRR abs/2406.01562 (2024)
- [i79] Brett Daley, Marlos C. Machado, Martha White:
Demystifying the Recency Heuristic in Temporal-Difference Learning. CoRR abs/2406.12284 (2024)
- [i78] Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas:
Position: Benchmarking is Limited in Reinforcement Learning Research. CoRR abs/2406.16241 (2024)
- [i77] Parham Mohammad Panahi, Andrew Patterson, Martha White, Adam White:
Investigating the Interplay of Prioritized Replay and Generalization. CoRR abs/2407.09702 (2024)
- [i76] Andrew Patterson, Samuel Neumann, Raksha Kumaraswamy, Martha White, Adam White:
The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning. CoRR abs/2407.18840 (2024)
- [i75] Lingwei Zhu, Haseeb Shah, Han Wang, Martha White:
q-exponential family for policy optimization. CoRR abs/2408.07245 (2024)
- [i74] Esraa Elelimy, Adam White, Michael Bowling, Martha White:
Real-Time Recurrent Learning using Trace Units in Reinforcement Learning. CoRR abs/2409.01449 (2024)
- 2023
- [j14] Vincent Liu, James R. Wright, Martha White:
Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning. J. Artif. Intell. Res. 77: 71-101 (2023)
- [j13] Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White:
Off-Policy Actor-Critic with Emphatic Weightings. J. Mach. Learn. Res. 24: 146:1-146:63 (2023)
- [j12] Khurram Javed, Haseeb Shah, Richard S. Sutton, Martha White:
Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks. J. Mach. Learn. Res. 24: 256:1-256:34 (2023)
- [j11] Andrew Patterson, Victor Liao, Martha White:
Robust Losses for Learning Value Functions. IEEE Trans. Pattern Anal. Mach. Intell. 45(5): 6157-6167 (2023)
- [j10] Erfan Miahi, Revan MacQueen, Alex Ayoub, Abbas Masoumzadeh, Martha White:
Resmax: An Alternative Soft-Greedy Operator for Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [j9] Matthew Schlegel, Volodymyr Tkachuk, Adam M. White, Martha White:
Investigating Action Encodings in Recurrent Neural Networks in Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [c60] Vincent Liu, Yash Chandak, Philip S. Thomas, Martha White:
Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments. AISTATS 2023: 5474-5492
- [c59] Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White:
Measuring and Mitigating Interference in Reinforcement Learning. CoLLAs 2023: 781-795
- [c58] Samuel Neumann, Sungsu Lim, Ajin George Joseph, Yangchen Pan, Adam White, Martha White:
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement. ICLR 2023
- [c57] Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White:
The In-Sample Softmax for Offline Reinforcement Learning. ICLR 2023
- [c56] Brett Daley, Martha White, Christopher Amato, Marlos C. Machado:
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning. ICML 2023: 6818-6835
- [c55] Lingwei Zhu, Zheng Chen, Matthew Schlegel, Martha White:
General Munchausen Reinforcement Learning with Tsallis Kullback-Leibler Divergence. NeurIPS 2023
- [i73] Brett Daley, Martha White, Christopher Amato, Marlos C. Machado:
Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning. CoRR abs/2301.11321 (2023)
- [i72] Lingwei Zhu, Zheng Chen, Takamitsu Matsubara, Martha White:
Generalized Munchausen Reinforcement Learning using Tsallis KL Divergence. CoRR abs/2301.11476 (2023)
- [i71] Khurram Javed, Haseeb Shah, Richard S. Sutton, Martha White:
Online Real-Time Recurrent Learning Using Sparse Connections and Selective Learning. CoRR abs/2302.05326 (2023)
- [i70] Vincent Liu, Yash Chandak, Philip S. Thomas, Martha White:
Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments. CoRR abs/2302.11725 (2023)
- [i69] Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White:
The In-Sample Softmax for Offline Reinforcement Learning. CoRR abs/2302.14372 (2023)
- [i68] Andrew Patterson, Samuel Neumann, Martha White, Adam White:
Empirical Design in Reinforcement Learning. CoRR abs/2304.01315 (2023)
- [i67] James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas:
Coagent Networks: Generalized and Scaled. CoRR abs/2305.09838 (2023)
- [i66] Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White:
Measuring and Mitigating Interference in Reinforcement Learning. CoRR abs/2307.04887 (2023)
- [i65] Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White:
GVFs in the Real World: Making Predictions Online for Water Treatment. CoRR abs/2312.01624 (2023)
- [i64] Vincent Liu, Prabhat Nagarajan, Andrew Patterson, Martha White:
When is Offline Policy Selection Sample Efficient for Reinforcement Learning? CoRR abs/2312.02355 (2023)
- 2022
- [j8] Andrew Patterson, Adam White, Martha White:
A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning. J. Mach. Learn. Res. 23: 145:1-145:61 (2022)
- [j7] Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White:
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences. J. Mach. Learn. Res. 23: 253:1-253:79 (2022)
- [j6] Ehsan Imani, Wei Hu, Martha White:
Representation Alignment in Neural Networks. Trans. Mach. Learn. Res. 2022 (2022)
- [j5] Han Wang, Archit Sakhadeo, Adam M. White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White:
No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL. Trans. Mach. Learn. Res. 2022 (2022)
- [c54] Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, Rupam Mahmood:
An Alternate Policy Gradient Estimator for Softmax Policies. AISTATS 2022: 6630-6689
- [c53] Kirby Banman, Liam Peet-Pare, Nidhi Hegde, Alona Fyshe, Martha White:
Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum. ICLR 2022
- [c52] Samuele Tosatto, Andrew Patterson, Martha White, Rupam Mahmood:
A Temporal-Difference Approach to Policy Gradient Estimation. ICML 2022: 21609-21632
- [c51] Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand, Martha White, Hengshuai Yao, Mohsen Rohani, Jun Luo:
Understanding and mitigating the limitations of prioritized experience replay. UAI 2022: 1561-1571
- [i63] Samuele Tosatto, Andrew Patterson, Martha White, A. Rupam Mahmood:
A Temporal-Difference Approach to Policy Gradient Estimation. CoRR abs/2202.02396 (2022)
- [i62] Matthew McLeod, Chunlok Lo, Matthew Schlegel, Andrew Jacobsen, Raksha Kumaraswamy, Martha White, Adam White:
Continual Auxiliary Task Learning. CoRR abs/2202.11133 (2022)
- [i61] Kirby Banman, Liam Peet-Pare, Nidhi Hegde, Alona Fyshe, Martha White:
Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum. CoRR abs/2203.11992 (2022)
- [i60] Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White:
Investigating the Properties of Neural Network Representations in Reinforcement Learning. CoRR abs/2203.15955 (2022)
- [i59] Andrew Patterson, Victor Liao, Martha White:
Robust Losses for Learning Value Functions. CoRR abs/2205.08464 (2022)
- [i58] Han Wang, Archit Sakhadeo, Adam White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White:
No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL. CoRR abs/2205.08716 (2022)
- [i57] Chunlok Lo, Gabor Mihucz, Adam White, Farzane Aminmansour, Martha White:
Goal-Space Planning with Subgoal Models. CoRR abs/2206.02902 (2022)
- 2021
- [j4] Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White:
General Value Function Networks. J. Artif. Intell. Res. 70: 497-543 (2021)
- [j3] Sebastian Höfer, Kostas E. Bekris, Ankur Handa, Juan Camilo Gamboa, Melissa Mozifian, Florian Golemo, Christopher G. Atkeson, Dieter Fox, Ken Goldberg, John Leonard, C. Karen Liu, Jan Peters, Shuran Song, Peter Welinder, Martha White:
Sim2Real in Robotics and Automation: Applications and Challenges. IEEE Trans Autom. Sci. Eng. 18(2): 398-400 (2021)
- [c50] Yangchen Pan, Kirby Banman, Martha White:
Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online. ICLR 2021
- [c49] Matthew McLeod, Chunlok Lo, Matthew Schlegel, Andrew Jacobsen, Raksha Kumaraswamy, Martha White, Adam White:
Continual Auxiliary Task Learning. NeurIPS 2021: 12549-12562
- [c48] Dhawal Gupta, Gabor Mihucz, Matthew Schlegel, James E. Kostas, Philip S. Thomas, Martha White:
Structural Credit Assignment in Neural Networks using Reinforcement Learning. NeurIPS 2021: 30257-30270
- [i56] Khurram Javed, Martha White, Richard S. Sutton:
Scalable Online Recurrent Learning Using Columnar Neural Networks. CoRR abs/2103.05787 (2021)
- [i55] Andrew Patterson, Adam White, Sina Ghiassian, Martha White:
A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning. CoRR abs/2104.13844 (2021)
- [i54] Qingfeng Lan, Luke Kumar, Martha White, Alona Fyshe:
Predictive Representation Learning for Language Modeling. CoRR abs/2105.14214 (2021)
- [i53] Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White:
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences. CoRR abs/2107.08285 (2021)
- [i52] Vincent Liu, James R. Wright, Martha White:
Exploiting Action Impact Regularity and Partially Known Models for Offline Reinforcement Learning. CoRR abs/2111.08066 (2021)
- [i51] Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White:
Off-Policy Actor-Critic with Emphatic Weightings. CoRR abs/2111.08172 (2021)
- [i50] Ehsan Imani, Wei Hu, Martha White:
Understanding Feature Transfer Through Representation Alignment. CoRR abs/2112.07806 (2021)
- [i49] Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, A. Rupam Mahmood:
An Alternate Policy Gradient Estimator for Softmax Policies. CoRR abs/2112.11622 (2021)
- 2020
- [j2] Cam Linke, Nadia M. Ady, Martha White, Thomas Degris, Adam White:
Adapting Behavior via Intrinsic Reward: A Survey and Empirical Study. J. Artif. Intell. Res. 69: 1287-1332 (2020)
- [c47] Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White:
Maximizing Information Gain in Partially Observable Environments via Prediction Rewards. AAMAS 2020: 1215-1223
- [c46] Maryam Hashemzadeh, Greta Kaufeld, Martha White, Andrea E. Martin, Alona Fyshe:
From Language to Language-ish: How Brain-Like is an LSTM's Representation of Atypical Language Stimuli? EMNLP (Findings) 2020: 645-656
- [c45] Qingfeng Lan, Yangchen Pan, Alona Fyshe, Martha White:
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning. ICLR 2020
- [c44] Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White:
Training Recurrent Neural Networks Online by Learning Explicit State Variables. ICLR 2020
- [c43] Zaheer Abbas, Samuel Sokota, Erin Talvitie, Martha White:
Selective Dyna-Style Planning Under Limited Model Capacity. ICML 2020: 1-10
- [c42] Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas:
Optimizing for the Future in Non-Stationary MDPs. ICML 2020: 1414-1425
- [c41] Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White:
Gradient Temporal-Difference Learning with Regularized Corrections. ICML 2020: 3524-3534
- [c40] Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas:
Towards Safe Policy Improvement for Non-Stationary MDPs. NeurIPS 2020
- [c39] Yangchen Pan, Ehsan Imani, Amir-massoud Farahmand, Martha White:
An implicit function learning approach for parametric modal regression. NeurIPS 2020
- [i48] Yangchen Pan, Ehsan Imani, Martha White, Amir-massoud Farahmand:
An implicit function learning approach for parametric modal regression. CoRR abs/2002.06195 (2020)
- [i47] Qingfeng Lan, Yangchen Pan, Alona Fyshe, Martha White:
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning. CoRR abs/2002.06487 (2020)
- [i46] Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White:
Maximizing Information Gain in Partially Observable Environments via Prediction Reward. CoRR abs/2005.04912 (2020)
- [i45] Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas:
Optimizing for the Future in Non-Stationary MDPs. CoRR abs/2005.08158 (2020)
- [i44] Taher Jafferjee, Ehsan Imani, Erin Talvitie, Martha White, Michael Bowling:
Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models. CoRR abs/2006.04363 (2020)
- [i43] Khurram Javed, Martha White, Yoshua Bengio:
Learning Causal Models Online. CoRR abs/2006.07461 (2020)
- [i42] Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White:
Gradient Temporal-Difference Learning with Regularized Corrections. CoRR abs/2007.00611 (2020)
- [i41] Zaheer Abbas, Samuel Sokota, Erin J. Talvitie, Martha White:
Selective Dyna-style Planning Under Limited Model Capacity. CoRR abs/2007.02418 (2020)
- [i40] Vincent Liu, Adam White, Hengshuai Yao, Martha White:
Towards a practical measure of interference for reinforcement learning. CoRR abs/2007.03807 (2020)
- [i39] Jincheng Mei, Yangchen Pan, Martha White, Amir-massoud Farahmand, Hengshuai Yao:
Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities. CoRR abs/2007.09569 (2020)
- [i38] Maryam Hashemzadeh, Greta Kaufeld, Martha White, Andrea E. Martin, Alona Fyshe:
From Language to Language-ish: How Brain-Like is an LSTM's Representation of Nonsensical Language Stimuli? CoRR abs/2010.07435 (2020)
- [i37] Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas:
Towards Safe Policy Improvement for Non-Stationary MDPs. CoRR abs/2010.12645 (2020)
- [i36] Sebastian Höfer, Kostas E. Bekris, Ankur Handa, Juan Camilo Gamboa Higuera, Florian Golemo, Melissa Mozifian, Christopher G. Atkeson, Dieter Fox, Ken Goldberg, John Leonard, C. Karen Liu, Jan Peters, Shuran Song, Peter Welinder, Martha White:
Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop. CoRR abs/2012.03806 (2020)
2010 – 2019
- 2019
- [c38] Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White:
Meta-Descent for Online, Continual Prediction. AAAI 2019: 3943-3950
- [c37] Vincent Liu, Raksha Kumaraswamy, Lei Le, Martha White:
The Utility of Sparse Representations for Control in Reinforcement Learning. AAAI 2019: 4384-4391
- [c36] Wesley Chung, Somjit Nath, Ajin Joseph, Martha White:
Two-Timescale Networks for Nonlinear Value Function Approximation. ICLR (Poster) 2019
- [c35] Yangchen Pan, Hengshuai Yao, Amir-massoud Farahmand, Martha White:
Hill Climbing on Value Estimates for Search-control in Dyna. IJCAI 2019: 3209-3215
- [c34] Yi Wan, Muhammad Zaheer, Adam White, Martha White, Richard S. Sutton:
Planning with Expectation Models. IJCAI 2019: 3649-3655
- [c33] Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White:
Importance Resampling for Off-policy Prediction. NeurIPS 2019: 1797-1807
- [c32] Khurram Javed, Martha White:
Meta-Learning Representations for Continual Learning. NeurIPS 2019: 1818-1828
- [c31] Farzane Aminmansour, Andrew Patterson, Lei Le, Yisu Peng, Daniel Mitchell, Franco Pestilli, Cesar F. Caiafa, Russell Greiner, Martha White:
Learning Macroscopic Brain Connectomes via Group-Sparse Factorization. NeurIPS 2019: 8847-8857
- [i35] Yi Wan, Muhammad Zaheer, Adam White, Martha White, Richard S. Sutton:
Planning with Expectation Models. CoRR abs/1904.01191 (2019)
- [i34] Khurram Javed, Martha White:
Meta-Learning Representations for Continual Learning. CoRR abs/1905.12588 (2019)
- [i33] Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White:
Importance Resampling for Off-policy Prediction. CoRR abs/1906.04328 (2019)
- [i32] Yangchen Pan, Hengshuai Yao, Amir-massoud Farahmand, Martha White:
Hill Climbing on Value Estimates for Search-control in Dyna. CoRR abs/1906.07791 (2019)
- [i31] Cam Linke, Nadia M. Ady, Martha White, Thomas Degris, Adam White:
Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study. CoRR abs/1906.07865 (2019)
- [i30] Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White:
Meta-descent for Online, Continual Prediction. CoRR abs/1907.07751 (2019)
- [i29] Khurram Javed, Hengshuai Yao, Martha White:
Is Fast Adaptation All You Need? CoRR abs/1910.01705 (2019)
- 2018
- [c30] Ehsan Imani, Martha White:
Improving Regression Performance with Distributional Losses. ICML 2018: 2162-2171
- [c29] Yangchen Pan, Amir-massoud Farahmand, Martha White, Saleh Nabi, Piyush Grover, Daniel Nikovski:
Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control. ICML 2018: 3983-3992
- [c28] Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White:
Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains. IJCAI 2018: 4794-4800
- [c27] Ehsan Imani, Eric Graves, Martha White:
An Off-policy Policy Gradient Theorem Using Emphatic Weightings. NeurIPS 2018: 96-106
- [c26] Lei Le, Andrew Patterson, Martha White:
Supervised autoencoders: Improving generalization performance with unsupervised regularizers. NeurIPS 2018: 107-117
- [c25] Raksha Kumaraswamy, Matthew Schlegel, Adam White, Martha White:
Context-dependent upper-confidence bounds for directed exploration. NeurIPS 2018: 4784-4794
- [c24] Craig Sherstan, Dylan R. Ashley, Brendan Bennett, Kenny Young, Adam White, Martha White, Richard S. Sutton:
Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return. UAI 2018: 63-72
- [c23] Touqir Sajed, Wesley Chung, Martha White:
High-confidence error estimates for learned value functions. UAI 2018: 683-692
- [i28] Craig Sherstan, Brendan Bennett, Kenny Young, Dylan R. Ashley, Adam White, Martha White, Richard S. Sutton:
Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods. CoRR abs/1801.08287 (2018)
- [i27] Ehsan Imani, Martha White:
Improving Regression Performance with Distributional Losses. CoRR abs/1806.04613 (2018)
- [i26] Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White:
Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains. CoRR abs/1806.04624 (2018)
- [i25] Yangchen Pan, Amir-massoud Farahmand, Martha White, Saleh Nabi, Piyush Grover, Daniel Nikovski:
Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control. CoRR abs/1806.06931 (2018)
- [i24] Matthew Schlegel, Adam White, Andrew Patterson, Martha White:
General Value Function Networks. CoRR abs/1807.06763 (2018)
- [i23] Touqir Sajed, Wesley Chung, Martha White:
High-confidence error estimates for learned value functions. CoRR abs/1808.09127 (2018)
- [i22] Sungsu Lim, Ajin Joseph, Lei Le, Yangchen Pan, Martha White:
Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces. CoRR abs/1810.09103 (2018)
- [i21] Sina Ghiassian, Andrew Patterson, Martha White, Richard S. Sutton, Adam White:
Online Off-policy Prediction. CoRR abs/1811.02597 (2018)
- [i20] Vincent Liu, Raksha Kumaraswamy, Lei Le, Martha White:
The Utility of Sparse Representations for Control in Reinforcement Learning. CoRR abs/1811.06626 (2018)
- [i19] Raksha Kumaraswamy, Matthew Schlegel, Adam White, Martha White:
Context-Dependent Upper-Confidence Bounds for Directed Exploration. CoRR abs/1811.06629 (2018)
- [i18] Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc G. Bellemare, Doina Precup:
The Barbados 2018 List of Open Issues in Continual Learning. CoRR abs/1811.07004 (2018)
- [i17] Ehsan Imani, Eric Graves, Martha White:
An Off-policy Policy Gradient Theorem Using Emphatic Weightings. CoRR abs/1811.09013 (2018)
- [i16] Minghan Li, Tanli Zuo, Ruicheng Li, Martha White, Weishi Zheng:
Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling. CoRR abs/1812.00914 (2018)
- 2017
- [c22] Shantanu Jain, Martha White, Predrag Radivojac:
Recovering True Classifier Performance in Positive-Unlabeled Learning. AAAI 2017: 2066-2072
- [c21] Yangchen Pan, Adam White, Martha White:
Accelerated Gradient Temporal Difference Learning. AAAI 2017: 2464-2470
- [c20] Matthew Schlegel, Yangchen Pan, Jiecao Chen, Martha White:
Adapting Kernel Representations Online Using Submodular Maximization. ICML 2017: 3037-3046
- [c19] Martha White:
Unifying Task Specification in Reinforcement Learning. ICML 2017: 3742-3750
- [c18] Lei Le, Raksha Kumaraswamy, Martha White:
Learning Sparse Representations in Reinforcement Learning with Sparse Coding. IJCAI 2017: 2067-2073
- [c17] Mahdi Karami, Martha White, Dale Schuurmans, Csaba Szepesvári:
Multi-view Matrix Factorization for Linear Dynamical System Estimation. NIPS 2017: 7092-7101
- [c16] Yangchen Pan, Erfan Sadeqi Azer, Martha White:
Effective sketching methods for value function approximation. UAI 2017
- [i15] Shantanu Jain, Martha White, Predrag Radivojac:
Recovering True Classifier Performance in Positive-Unlabeled Learning. CoRR abs/1702.00518 (2017)
- [i14] Lei Le, Raksha Kumaraswamy, Martha White:
Learning Sparse Representations in Reinforcement Learning with Sparse Coding. CoRR abs/1707.08316 (2017)
- [i13] Yangchen Pan, Erfan Sadeqi Azer, Martha White:
Effective sketching methods for value function approximation. CoRR abs/1708.01298 (2017)
- 2016
- [j1] Richard S. Sutton, Ashique Rupam Mahmood, Martha White:
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning. J. Mach. Learn. Res. 17: 73:1-73:29 (2016)
- [c15] Adam White, Martha White:
Investigating Practical Linear Temporal Difference Learning. AAMAS 2016: 494-502
- [c14] Martha White, Adam White:
A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning. AAMAS 2016: 557-565
- [c13] Clement Gehring, Yangchen Pan, Martha White:
Incremental Truncated LSTD. IJCAI 2016: 1505-1511
- [c12] Shantanu Jain, Martha White, Predrag Radivojac:
Estimating the class prior and posterior from noisy positives and unlabeled data. NIPS 2016: 2685-2693
- [i12] Shantanu Jain, Martha White, Michael W. Trosset, Predrag Radivojac:
Nonparametric semi-supervised learning of class proportions. CoRR abs/1601.01944 (2016)
- [i11] Adam White, Martha White:
Investigating practical, linear temporal difference learning. CoRR abs/1602.08771 (2016)
- [i10] Lei Le, Martha White:
Global optimization of factor models using alternating minimization. CoRR abs/1604.04942 (2016)
- [i9] Shantanu Jain, Martha White, Predrag Radivojac:
Estimating the class prior and posterior from noisy positives and unlabeled data. CoRR abs/1606.08561 (2016)
- [i8] Martha White, Adam White:
A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning. CoRR abs/1607.00446 (2016)
- [i7] Martha White:
Unifying task specification in reinforcement learning. CoRR abs/1609.01995 (2016)
- [i6] Yangchen Pan, Adam White, Martha White:
Accelerated Gradient Temporal Difference Learning. CoRR abs/1611.09328 (2016)
- 2015
- [c11] Martha White, Junfeng Wen, Michael Bowling, Dale Schuurmans:
Optimal Estimation of Multivariate ARMA Models. AAAI 2015: 3080-3086
- [c10] Farzaneh Mirzazadeh, Martha White, András György, Dale Schuurmans:
Scalable Metric Learning for Co-Embedding. ECML/PKDD (1) 2015: 625-642
- [i5] Richard S. Sutton, Ashique Rupam Mahmood, Martha White:
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning. CoRR abs/1503.04269 (2015)
- [i4] Ashique Rupam Mahmood, Huizhen Yu, Martha White, Richard S. Sutton:
Emphatic Temporal-Difference Learning. CoRR abs/1507.01569 (2015)
- [i3] Clement Gehring, Martha White:
Incremental Truncated LSTD. CoRR abs/1511.08495 (2015)
- 2013
- [c9] Joel Veness, Martha White, Michael Bowling, András György:
Partition Tree Weighting. DCC 2013: 321-330
- 2012
- [c8] Thomas Degris, Martha White, Richard S. Sutton:
Linear Off-Policy Actor-Critic. ICML 2012
- [c7] Martha White, Yaoliang Yu, Xinhua Zhang, Dale Schuurmans:
Convex Multi-view Subspace Learning. NIPS 2012: 1682-1690
- [c6] Martha White, Dale Schuurmans:
Generalized Optimal Reverse Prediction. AISTATS 2012: 1305-1313
- [i2] Thomas Degris, Martha White, Richard S. Sutton:
Off-Policy Actor-Critic. CoRR abs/1205.4839 (2012)
- [i1] Joel Veness, Martha White, Michael Bowling, András György:
Partition Tree Weighting. CoRR abs/1211.0587 (2012)
- 2011
- [c5] Xinhua Zhang, Yaoliang Yu, Martha White, Ruitong Huang, Dale Schuurmans:
Convex Sparse Coding, Subspace Learning, and Semi-Supervised Extensions. AAAI 2011: 567-573
- 2010
- [c4] Martha White, Adam White:
Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains. NIPS 2010: 2433-2441
- [c3] Yaoliang Yu, Min Yang, Linli Xu, Martha White, Dale Schuurmans:
Relaxed Clipping: A Global Training Method for Robust Regression and Classification. NIPS 2010: 2532-2540
2000 – 2009
- 2009
- [c2] Linli Xu, Martha White, Dale Schuurmans:
Optimal reverse prediction: a unified perspective on supervised, unsupervised and semi-supervised learning. ICML 2009: 1137-1144
- [c1] Martha White, Michael H. Bowling:
Learning a Value Analysis Tool for Agent Evaluation. IJCAI 2009: 1976-1981
last updated on 2024-10-07 21:17 CEST by the dblp team
all metadata released as open data under CC0 1.0 license