Memristor-based hardware accelerators for artificial intelligence
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).
Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
Strubell, E., Ganesh, A. & McCallum, A. Energy and policy considerations for deep learning in NLP. In Proc. 57th Annual Meeting of the Association for Computational Linguistics 3645–3650 (ACL, 2019).
Merolla, P. A. et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345, 668–673 (2014). This paper reports the design and testing of TrueNorth, a complementary metal-oxide-semiconductor-based neuromorphic chip.
Modha, D. S. et al. Neural inference at the frontier of energy, space, and time. Science 382, 329–335 (2023). This paper reports the design and testing of NorthPole, a complementary metal-oxide-semiconductor-based near-memory computing chip.
Hinton, G. The forward–forward algorithm: some preliminary investigations. Preprint at arXiv https://doi.org/10.48550/arXiv.2212.13345 (2022).
Chua, L. O. Memristor — the missing circuit element. IEEE Trans. Circ. Theory 18, 507–519 (1971). This paper proposes the memristor concept.
Strukov, D. B., Snider, G. S., Stewart, D. R. & Williams, R. S. The missing memristor found. Nature 453, 80–83 (2008). This paper connects the metal oxide-based resistance switches with the memristor concept.
Yang, J. J. et al. Memristive switching mechanism for metal/oxide/metal nanodevices. Nat. Nanotechnol. 3, 429–433 (2008).
Yang, J. J., Strukov, D. B. & Stewart, D. R. Memristive devices for computing. Nat. Nanotechnol. 8, 13–24 (2013).
Prezioso, M. et al. Training and operation of an integrated neuromorphic network based on metal-oxide memristors. Nature 521, 61–64 (2015). This paper reports the experimental implementation of training and inference in a 12 × 12 array of TiO2 memristors.
Wang, Z. et al. Resistive switching materials for information processing. Nat. Rev. Mater. 5, 173–195 (2020).
Goswami, S. et al. Decision trees within a molecular memristor. Nature 597, 51–56 (2021).
Pi, S. et al. Memristor crossbar arrays with 6-nm half-pitch and 2-nm critical dimension. Nat. Nanotechnol. 14, 35–39 (2019). This paper reports the scaling of memristors in a crossbar array down to 2 nm in size.
Sarwat, S. G., Kersting, B., Moraitis, T., Jonnalagadda, V. P. & Sebastian, A. Phase-change memtransistive synapses for mixed-plasticity neural computations. Nat. Nanotechnol. 17, 507–513 (2022).
Milano, G. et al. In materia reservoir computing with a fully memristive architecture based on self-organizing nanowire networks. Nat. Mater. 21, 195–202 (2022).
Rao, M. et al. Thousands of conductance levels in memristors integrated on CMOS. Nature 615, 823–829 (2023). This paper reports 2,048 conductance levels achieved in foundry-fabricated memristors, the highest to date.
Onen, M. et al. Nanosecond protonic programmable resistors for analog deep learning. Science 377, 539–543 (2022).
Yao, P. et al. Face classification using electronic synapses. Nat. Commun. 8, 1–8 (2017).
Li, C. et al. Long short-term memory networks in memristor crossbar arrays. Nat. Mach. Intell. 1, 49–57 (2019).
Liang, X. et al. Rotating neurons for all-analog implementation of cyclic reservoir computing. Nat. Commun. 13, 1549 (2022).
Xia, Q. & Yang, J. J. Memristive crossbar arrays for brain-inspired computing. Nat. Mater. 18, 309–323 (2019).
Lanza, M. et al. Memristive technologies for data storage, computation, encryption, and radio-frequency communication. Science 376, eabj9979 (2022).
Wang, T. et al. Reconfigurable neuromorphic memristor network for ultralow-power smart textile electronics. Nat. Commun. 13, 7432 (2022).
Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. & Eleftheriou, E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529–544 (2020).
Chen, W.-H. et al. CMOS-integrated memristive non-volatile computing-in-memory for AI edge processors. Nat. Electron. 2, 420–428 (2019).
Xue, C.-X. et al. A CMOS-integrated compute-in-memory macro based on resistive random-access memory for AI edge devices. Nat. Electron. 4, 81–90 (2020).
Hung, J.-M. et al. A four-megabit compute-in-memory macro with eight-bit precision based on CMOS and resistive random-access memory for AI edge devices. Nat. Electron. 4, 921–930 (2021).
Le Gallo, M. et al. A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference. Nat. Electron. 6, 680–693 (2023). This paper reports a 64-core hardware accelerator based on phase-change memory with on-chip communication networks.
Ambrogio, S. et al. An analog-AI chip for energy-efficient speech recognition and transcription. Nature 620, 768–775 (2023). This paper reports an analog-AI chip with 35 million phase-change memory devices across 34 tiles.
Zhang, W. et al. Edge learning using a fully integrated neuro-inspired memristor chip. Science 381, 1205–1211 (2023).
Kim, H., Mahmoodi, M. R., Nili, H. & Strukov, D. B. 4K-memristor analog-grade passive crossbar circuit. Nat. Commun. 12, 5198 (2021). This paper reports one of the largest analogue passive memristor arrays (64 × 64) for pattern classification.
Li, C. et al. Three-dimensional crossbar arrays of self-rectifying Si/SiO2/Si memristors. Nat. Commun. 8, 15666 (2017).
Wu, C., Kim, T. W., Choi, H. Y., Strukov, D. B. & Yang, J. J. Flexible three-dimensional artificial synapse networks with correlated learning and trainable memory capability. Nat. Commun. 8, 752 (2017).
Li, C. et al. Analogue signal and image processing with large memristor crossbars. Nat. Electron. 1, 52–59 (2018). This paper reports the analogue vector–matrix multiplication in a 128 × 64 1T1R crossbar, the largest at the time of publication.
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Wan, W. et al. A compute-in-memory chip based on resistive random-access memory. Nature 608, 504–512 (2022).
Ye, W. et al. A 28-nm RRAM computing-in-memory macro using weighted hybrid 2T1R cell array and reference subtracting sense amplifier for AI edge inference. IEEE J. Solid-State Circuits 58, 2839–2850 (2023).
Liu, Q. et al. 33.2 A fully integrated analog ReRAM based 78.4TOPS/W compute-in-memory chip with fully parallel MAC computing. In 2020 IEEE International Solid-State Circuits Conference 500–502 (IEEE, 2020).
Li, H. et al. SAPIENS: a 64-kb RRAM-based non-volatile associative memory for one-shot learning and inference at the edge. IEEE Trans. Electron. Devices 68, 6637–6643 (2021).
Choi, B. J. et al. Trilayer tunnel selectors for memristor memory cells. Adv. Mater. 28, 356–362 (2016).
Midya, R. et al. Anatomy of Ag/Hafnia-based selectors with 10¹⁰ nonlinearity. Adv. Mater. https://doi.org/10.1002/adma.201604457 (2017).
Rao, M. et al. Timing selector: using transient switching dynamics to solve the sneak path issue of crossbar arrays. Small Sci. 2, 2100072 (2022).
Cai, F. et al. A fully integrated reprogrammable memristor–CMOS system for efficient multiply–accumulate operations. Nat. Electron. 2, 290–299 (2019).
Yu, S., Jiang, H., Huang, S., Peng, X. & Lu, A. Compute-in-memory chips for deep learning: recent trends and prospects. IEEE Circ. Syst. Mag. 21, 31–56 (2021).
Hung, J.-M. et al. 8-b Precision 8-Mb ReRAM compute-in-memory macro using direct-current-free time-domain readout scheme for AI edge devices. IEEE J. Solid-State Circuits 58, 303–315 (2023).
Huang, W.-H. et al. A nonvolatile AI-edge processor with 4MB SLC-MLC hybrid-mode ReRAM compute-in-memory macro and 51.4-251TOPS/W. In 2023 IEEE International Solid-State Circuits Conference 15–17 (IEEE, 2023).
Boybat, I. et al. Neuromorphic computing with multi-memristive synapses. Nat. Commun. 9, 2514 (2018).
Xia, Q. et al. Memristor–CMOS hybrid integrated circuits for reconfigurable logic. Nano Lett. 9, 3640–3645 (2009). This paper reports the first integration of memristors with foundry-made complementary metal-oxide-semiconductor circuitry.
Gong, N. et al. Deep learning acceleration in 14 nm CMOS compatible ReRAM array: device, material and algorithm co-optimization. In 2022 International Electron Devices Meeting 33.7.1–33.7.4 (IEEE, 2022).
Berdan, R. et al. Low-power linear computation using nonlinear ferroelectric tunnel junction memristors. Nat. Electron. 3, 259–266 (2020).
Karunaratne, G. et al. In-memory hyperdimensional computing. Nat. Electron. 3, 327–337 (2020).
Zidan, M. A. et al. A general memristor-based partial differential equation solver. Nat. Electron. 1, 411–420 (2018).
Li, C. et al. Efficient and self-adaptive in-situ learning in multilayer memristor neural networks. Nat. Commun. 9, 7–14 (2018). This paper reports linear and symmetric weight updating and demonstrates that a memristor array with a certain level of defects and variations can still serve as a good artificial intelligence computing engine.
Wang, Z. et al. In situ training of feed-forward and recurrent convolutional memristor networks. Nat. Mach. Intell. 1, 434–442 (2019).
Wang, Z. et al. Reinforcement learning with analogue memristor arrays. Nat. Electron. 2, 115–124 (2019).
Li, Y. et al. Memristive field-programmable analog arrays for analog computing. Adv. Mater. 35, e2206648 (2023).
Pedretti, G. et al. Tree-based machine learning performed in-memory with memristive analog CAM. Nat. Commun. 12, 5806 (2021).
Le Gallo, M. et al. Mixed-precision in-memory computing. Nat. Electron. 1, 246–253 (2018).
Sheridan, P. M. et al. Sparse coding with memristor networks. Nat. Nanotechnol. 12, 784–789 (2017).
Wang, C. et al. Scalable massively parallel computing using continuous-time data representation in nanoscale crossbar array. Nat. Nanotechnol. 16, 1079–1085 (2021).
Zhao, H. et al. Energy-efficient high-fidelity image reconstruction with memristor arrays for medical diagnosis. Nat. Commun. 14, 2276 (2023).
Wang, C. et al. Parallel in-memory wireless computing. Nat. Electron. 6, 381–389 (2023).
Zhu, X., Wang, Q. & Lu, W. D. Memristor networks for real-time neural activity analysis. Nat. Commun. 11, 2439 (2020).
Du, C. et al. Reservoir computing using dynamic memristors for temporal information processing. Nat. Commun. 8, 2204 (2017). This paper reports an early demonstration of reservoir computing with memristive crossbar arrays.
Zhong, Y. et al. A memristor-based analogue reservoir computing system for real-time and power-efficient signal processing. Nat. Electron. 5, 672–681 (2022).
Moon, J. et al. Temporal data classification and forecasting using a memristor-based reservoir computing system. Nat. Electron. 2, 480–487 (2019).
Nowshin, F., Huang, Y., Sarkar, Md. R., Xia, Q. & Yi, Y. MERRC: a memristor-enabled reconfigurable low-power reservoir computing architecture at the edge. IEEE Trans. Circuits Syst. I Regul. Pap. 71, 174–186 (2024).
Wang, Z. et al. Fully memristive neural networks for pattern classification with unsupervised learning. Nat. Electron. 1, 137–145 (2018).
Cai, F. et al. Power-efficient combinatorial optimization using intrinsic noise in memristor Hopfield neural networks. Nat. Electron. 3, 409–418 (2020).
Huo, Q. et al. A computing-in-memory macro based on three-dimensional resistive random-access memory. Nat. Electron. 5, 469–477 (2022).
Lin, P. et al. Three-dimensional memristor circuits as complex neural networks. Nat. Electron. 3, 225–232 (2020). This paper reports, to our knowledge, the first 3D memristive crossbar array for parallel convolution operations.
Li, Y. et al. Monolithic three-dimensional integration of RRAM-based hybrid memory architecture for one-shot learning. Nat. Commun. 14, 7140 (2023).
Du, Y. et al. Monolithic 3D integration of analog RRAM-based computing-in-memory and sensor for energy-efficient near-sensor computing. Adv. Mater. https://doi.org/10.1002/adma.202302658 (2023).
Choi, C. et al. Reconfigurable heterogeneous integration using stackable chips with embedded artificial intelligence. Nat. Electron. 5, 386–393 (2022).
Lin, P. & Xia, Q. Three-dimensional hybrid circuits: the future of neuromorphic computing hardware. Nano Express 2, 031003 (2021).
Bayat, F. M. et al. Implementation of multilayer perceptron network with highly uniform passive memristive crossbar circuits. Nat. Commun. 9, 2331 (2018).
Sahay, S., Bavandpour, M., Mahmoodi, M. R. & Strukov, D. Energy-efficient moderate precision time-domain mixed-signal vector-by-matrix multiplier exploiting 1T–1R arrays. IEEE J. Explor. Solid-State Comput. Devices Circuits 6, 18–26 (2020).
Freye, F. et al. Memristive devices for time domain compute-in-memory. IEEE J. Explor. Solid-State Comput. Devices Circuits 8, 119–127 (2022).
Xue, C.-X. et al. Embedded 1-Mb ReRAM-based computing-in-memory macro with multibit input and weight for CNN-based AI edge processors. IEEE J. Solid-State Circuits 55, 203–215 (2020).
Ambrogio, S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60–67 (2018).
Kiani, F., Yin, J., Wang, Z., Yang, J. J. & Xia, Q. A fully hardware-based memristive multilayer neural network. Sci. Adv. 7, eabj4801 (2021).
Prabhu, K. et al. CHIMERA: a 0.92-TOPS, 2.2-TOPS/W edge AI accelerator with 2-MByte on-chip foundry resistive RAM for efficient training and inference. IEEE J. Solid-State Circuits 57, 1013–1026 (2022).
Yoon, J.-H. et al. A 40-nm 118.44-TOPS/W voltage-sensing compute-in-memory RRAM macro with write verification and multi-bit encoding. IEEE J. Solid-State Circuits 57, 845–857 (2022).
Gong, N. et al. Signal and noise extraction from analog memory elements for neuromorphic computing. Nat. Commun. 9, 2102 (2018).
Mochida, R. et al. A 4 M synapses integrated analog ReRAM based 66.5 TOPS/W neural-network processor with cell current controlled writing and flexible network architecture. In 2018 IEEE Symposium on VLSI Technology 175–176 (IEEE, 2018).
Khwa, W.-S. et al. A 40-nm, 2M-cell, 8b-precision, hybrid SLC-MLC PCM computing-in-memory macro with 20.5–65.0 TOPS/W for tiny-AI edge devices. In 2022 IEEE International Solid-State Circuits Conference 1–3 (IEEE, 2022).
Xue, C.-X. et al. 24.1 A 1 Mb multibit ReRAM computing-in-memory macro with 14.6 ns parallel MAC computing time for CNN based AI edge processors. In 2019 IEEE International Solid-State Circuits Conference 388–390 (IEEE, 2019). This paper reports one of the earliest memristor macros supporting convolutional neural network operations using multibit input/weight.
Xue, C.-X. et al. 15.4 A 22 nm 2 Mb ReRAM compute-in-memory macro with 121-28TOPS/W for multibit MAC computing for tiny AI edge devices. In 2020 IEEE International Solid-State Circuits Conference 244–246 (IEEE, 2020).
Chen, W.-H. et al. A 65 nm 1 Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors. In 2018 IEEE International Solid-State Circuits Conference 494–496 (IEEE, 2018).
Yoon, J.-H. et al. 29.1 A 40 nm 64 kb 56.67TOPS/W read-disturb-tolerant compute-in-memory/digital RRAM macro with active-feedback-based read and in-situ write verification. In 2021 IEEE International Solid-State Circuits Conference 404–406 (IEEE, 2021).
Gokmen, T. & Haensch, W. Algorithm for training neural networks on resistive device arrays. Front. Neurosci. https://doi.org/10.3389/fnins.2020.00103 (2020).
Narayanan, P. et al. Fully on-chip MAC at 14 nm enabled by accurate row-wise programming of PCM-based weights and parallel vector-transport in duration-format. IEEE Trans. Electron. Devices 68, 6629–6636 (2021).
Mori, H. et al. A 4 nm 6163-TOPS/W/b 4790-TOPS/mm2/b SRAM based digital-computing-in-memory macro supporting bit-width flexibility and simultaneous MAC and weight update. In 2023 IEEE International Solid-State Circuits Conference 132–134 (IEEE, 2023).
Yue, J. et al. A 28 nm 16.9-300TOPS/W computing-in-memory processor supporting floating-point NN inference/training with intensive-CIM sparse-digital architecture. In 2023 IEEE International Solid-State Circuits Conference 1–3 (IEEE, 2023).
Schuman, C. D. et al. Opportunities for neuromorphic computing algorithms and applications. Nat. Comput. Sci. 2, 10–19 (2022).
Yang, X., Wu, C., Li, M. & Chen, Y. Tolerating noise effects in processing-in-memory systems for neural networks: a hardware–software codesign perspective. Adv. Intell. Syst. 4, 2200029 (2022).
Chakraborty, I., Roy, D. & Roy, K. Technology aware training in memristive neuromorphic systems for nonideal synaptic crossbars. IEEE Trans. Emerg. Top. Comput. Intell. 2, 335–344 (2018).
Kariyappa, S. et al. Noise-resilient DNN: tolerating noise in PCM-based AI accelerators via noise-aware training. IEEE Trans. Electron. Devices 68, 4356–4362 (2021).
Joshi, V. et al. Accurate deep neural network inference using computational phase-change memory. Nat. Commun. 11, 2473 (2020).
Rasch, M. J. et al. Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators. Nat. Commun. 14, 5282 (2023).
Maheshwari, S. et al. Design flow for hybrid CMOS/memristor systems — part II: circuit schematics and layout. IEEE Trans. Circuits Syst. I Regul. Pap. 68, 4876–4888 (2021).
Gao, B. et al. Memristor-based analogue computing for brain-inspired sound localization with in situ training. Nat. Commun. 13, 2026 (2022).
Zhang, Q. et al. Sign backpropagation: an on-chip learning algorithm for analog RRAM neuromorphic computing systems. Neural Netw. 108, 217–223 (2018).
Dalgaty, T. et al. In situ learning using intrinsic memristor variability via Markov chain Monte Carlo sampling. Nat. Electron. 4, 151–161 (2021).
Yi, S., Kendall, J. D., Williams, R. S. & Kumar, S. Activity-difference training of deep neural networks using memristor crossbars. Nat. Electron. 6, 45–51 (2022).
Chi, P. et al. PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory. In Proc. 2016 43rd International Symposium on Computer Architecture, ISCA 2016, 27–39 (IEEE, 2016). This paper reports one of the early architecture designs for in-memory computing systems based on memristive crossbar arrays.
Sun, H. et al. Gibbon: efficient co-exploration of NN model and processing-in-memory architecture. In 2022 Design, Automation & Test in Europe Conference & Exhibition 867–872 (IEEE, 2022).
Zhao, S., Qu, S., Wang, Y. & Han, Y. ENASA: towards edge neural architecture search based on CIM acceleration. In 2023 Design, Automation & Test in Europe Conference & Exhibition 1–2 (IEEE, 2023).
Zhu, Z. et al. Processing-in-hierarchical-memory architecture for billion-scale approximate nearest neighbor search. In 2023 60th ACM/IEEE Design Automation Conference 1–6 (IEEE, 2023).
Zhu, Y. et al. PIM-HLS: an automatic hardware generation tool for heterogeneous processing-in-memory-based neural network accelerators. In 2023 60th ACM/IEEE Design Automation Conference 1–6 (IEEE, 2023).
Liu, F. et al. ERA-BS: boosting the efficiency of ReRAM-based PIM accelerator with fine-grained bit-level sparsity. IEEE Trans. Comput. https://doi.org/10.1109/TC.2023.3290869 (2023).
Chang, M. et al. A 73.53TOPS/W 14.74TOPS heterogeneous RRAM in-memory and SRAM near-memory SoC for hybrid frame and event-based target tracking. In 2023 IEEE International Solid-State Circuits Conference 426–428 (IEEE, 2023).
Jain, S. et al. A heterogeneous and programmable compute-in-memory accelerator architecture for analog-AI using dense 2-D mesh. IEEE Trans. Very Large Scale Integr. VLSI Syst. 31, 114–127 (2023).
Kvatinsky, S., Friedman, E. G., Kolodny, A. & Weiser, U. C. TEAM: ThrEshold Adaptive Memristor model. IEEE Trans. Circuits Syst. I Regul. Pap. 60, 211–221 (2013).
Chen, P. Y. & Yu, S. Compact modeling of RRAM devices and its applications in 1T1R and 1S1R array design. IEEE Trans. Electron. Devices 62, 4022–4028 (2015).
Zhuo, Y. et al. A dynamical compact model of diffusive and drift memristors for neuromorphic computing. Adv. Electron. Mater. https://doi.org/10.1002/aelm.202100696 (2022).
Liu, Y. et al. Compact reliability model of analog RRAM for computation-in-memory device-to-system codesign and benchmark. IEEE Trans. Electron. Devices 68, 2686–2692 (2021).
Lammie, C. & Azghadi, M. R. MemTorch: a simulation framework for deep memristive cross-bar architectures. In 2020 IEEE International Symposium on Circuits and Systems 1–5 (IEEE, 2020).
Chen, P. Y., Peng, X. & Yu, S. NeuroSim: a circuit-level macro model for benchmarking neuro-inspired architectures in online learning. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 37, 3067–3080 (2018).
Zhu, Z. et al. MNSIM 2.0: a behavior-level modeling tool for processing-in-memory architectures. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 42, 4112–4125 (2023).
Rasch, M. J. et al. A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays. In 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (IEEE, 2021).
Le Gallo, M. et al. Using the IBM analog in-memory hardware acceleration kit for neural network training and inference. APL Mach. Learn. 1, 041102 (2023).
Xue, C.-X. et al. 16.1 A 22 nm 4 Mb 8b-precision ReRAM computing-in-memory macro with 11.91 to 195.7TOPS/W for tiny AI edge devices. In 2021 IEEE International Solid-State Circuits Conference 245–247 (IEEE, 2021).
Chan, V. et al. Yield methodology and heater process variation in phase change memory (PCM) technology for analog computing. IEEE Trans. Semicond. Manufact. 36, 327–331 (2023).
Mackin, C. et al. Optimised weight programming for analogue memory-based deep neural networks. Nat. Commun. 13, 3765 (2022).
Lanza, M., Molas, G. & Naveh, I. The gap between academia and industry in resistive switching research. Nat. Electron. 6, 260–263 (2023).
Chiu, Y. C. et al. A CMOS-integrated spintronic compute-in-memory macro for secure AI edge devices. Nat. Electron. 6, 534–543 (2023).
Spetalnick, S. D. et al. A 40 nm 64 kb 26.56TOPS/W 2.37 Mb/mm2 RRAM binary/compute-in-memory macro with 4.23× improvement in density and >75% use of sensing dynamic range. In 2022 IEEE International Solid-State Circuits Conference 1–3 (IEEE, 2022).