-
Djupdal, Asbjørn; Själander, Hans Magnus; Jahre, Magnus & Aunet, Snorre
(2023).
Minimizing the Energy Usage of Tiny RISC-V Cores.
-
Aunet, Snorre
(2023).
Ultra low voltage / low power technologies and building blocks for full custom integrated circuits - What? Why? How? Goals?
-
Nurmi, Jari; Aunet, Snorre & Saberkari, Alireza
(2023).
Guest Editorial Selected Papers From IEEE Nordic Circuits and Systems Conference (NorCAS) 2022.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
ISSN 1063-8210.
32(1),
p. 1–3.
doi: 10.1109/TVLSI.2023.3339268.
-
Aunet, Snorre
(2022).
Towards Sub-100 mV CMOS.
-
Su, Shengkai; Truong, Binh Duc; Aunet, Snorre & Le, Phu Cuong
(2021).
A reliable and wide-range tuning technique for low-frequency MEMS energy harvesters.
-
Aunet, Snorre
(2021).
Ultra Low Voltage Sub-100 mV Vdd CMOS.
-
Zadeh, Somayeh Hossein; Ytterdal, Trond & Aunet, Snorre
(2021).
Subthreshold Power PC and Nand Race-Free Flip-Flops in Frequency Divider Applications.
-
Aunet, Snorre
(2021).
Ørsmå energihøstende IoT-noder [Tiny energy-harvesting IoT nodes].
-
Zadeh, Somayeh Hossein; Ytterdal, Trond & Aunet, Snorre
(2020).
Comparative Study of Single, Regular and Flip well Subthreshold SRAMs in 22 nm FDSOI Technology.
-
Zadeh, Somayeh Hossein; Ytterdal, Trond & Aunet, Snorre
(2020).
Multi-threshold voltage and dynamic body biasing techniques for energy efficient ultra low voltage subthreshold adders.
-
Zadeh, Somayeh Hossein; Ytterdal, Trond & Aunet, Snorre
(2020).
An ultra low voltage subthreshold standard cell based memories for IoT applications.
-
Seyedi, Azam; Aunet, Snorre & Kjeldsberg, Per Gunnar
(2019).
Towards Compact Radiation Hardened Memories for Space Applications.
-
Hossein Zadeh, Somayeh; Ytterdal, Trond & Aunet, Snorre
(2019).
Exploring optimal back bias voltages for ultra low voltage CMOS digital Circuits in 22 nm FDSOI Technology.
-
Hossein Zadeh, Somayeh; Ytterdal, Trond & Aunet, Snorre
(2019).
Ultra-Low Voltage Subthreshold Binary Adder Architectures for IoT Applications: Ripple Carry Adder or Kogge Stone Adder.
-
Zadeh, Somayeh Hossein; Ytterdal, Trond & Aunet, Snorre
(2019).
Ultra low voltage subthreshold Binary Adder Architectures for IoT Applications: Ripple Carry Adder or Kogge Stone Adder.
-
Zadeh, Somayeh Hossein; Ytterdal, Trond & Aunet, Snorre
(2019).
Low Energy CMOS building blocks for IoT.
-
Aunet, Snorre
(2019).
Asynchronous ultra low voltage / low power CMOS - what and why, but not much about how.
Reducing the supply voltage to the subthreshold region, where the supply voltage is below the absolute values of the inherent threshold voltages, has been regarded as the most direct and dramatic means of reducing overall power consumption.
Full-custom logic and memory building blocks are needed to take full advantage of subthreshold operation, which may reduce power consumption by orders of magnitude, and energy per operation by typically 10 times or more, compared to normal super-threshold circuits. The lower the supply voltage, the lower the power consumption.
Examples of a 32-bit RISC processor operating at subthreshold voltages, as well as arithmetic building blocks operating at supply voltages below 100 mV are mentioned briefly.
The strong relationships between supply voltage and power consumption, as well as energy per operation, are due to several exponential relationships between voltages across nodes of the transistor, temperature and threshold voltages, in subthreshold operation.
Dealing with process-, voltage-, and temperature (“PVT”-) variations is the greatest challenge when exploiting subthreshold techniques for ultra low power / low energy circuits.
Circuit delays can vary by up to orders of magnitude due to PVT variations. When the traditional synchronous circuit design paradigm is combined with subthreshold operation, the necessary worst-case timing considerations lead to slowing down circuitry by up to orders of magnitude, while at the same time radically increasing power and energy consumption, compared to superthreshold / strong inversion circuits.
To fully exploit the low power / low energy potential of subthreshold circuits, asynchronous solutions not taking worst case behavior into account in the normal fashion, may be a key concept.
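The exponential dependence described in this summary can be illustrated with the textbook first-order subthreshold current model, I = I0 · exp((Vgs − Vth) / (n · VT)), with VT = kT/q. The following is a minimal sketch only: the values of `vth`, `n` and `i0` are illustrative assumptions and are not taken from the talk, and the drain-voltage term of the full model is ignored.

```python
import math

K_BOLTZMANN = 1.380649e-23      # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temp_kelvin):
    """Thermal voltage kT/q in volts."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def subthreshold_current(vgs, vth=0.35, n=1.3, i0=1e-7, temp_kelvin=300.0):
    """First-order subthreshold drain current in amperes.

    vth, n and i0 are assumed, illustrative values; the drain-voltage
    term of the full model is ignored.
    """
    return i0 * math.exp((vgs - vth) / (n * thermal_voltage(temp_kelvin)))


def subthreshold_swing(n=1.3, temp_kelvin=300.0):
    """Gate-voltage change (V) that changes the current by a factor of 10."""
    return n * thermal_voltage(temp_kelvin) * math.log(10)


if __name__ == "__main__":
    for vgs in (0.10, 0.15, 0.20, 0.25):
        print(f"Vgs = {vgs:.2f} V -> I = {subthreshold_current(vgs):.3e} A")
    print(f"swing = {subthreshold_swing() * 1e3:.1f} mV/decade")
```

Because both the threshold voltage and the temperature sit inside the exponent, small shifts in either change the current, and hence the delay, by large factors. This is one way to see why PVT variations are singled out as the main challenge.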
-
Aunet, Snorre
(2018).
Possibilities with ultra low power / low energy integrated circuits.
-
Vatanjou, Ali Asghar; Ytterdal, Trond & Aunet, Snorre
(2017).
Ultra-Low Voltage/Energy CMOS Building Blocks in 28 nm FDSOI Technology.
-
Aunet, Snorre
(2017).
Introduction to ultra-low power electronic circuits design.
-
Vatanjou, Ali Asghar; Låte, Even; Ytterdal, Trond & Aunet, Snorre
(2016).
Ultra-Low Voltage Adders in 28 nm FDSOI Exploring Poly-Biasing for Device Sizing.
-
Vatanjou, Ali Asghar; Ytterdal, Trond & Aunet, Snorre
(2016).
28 nm UTBB-FDSOI energy efficient and variation tolerant custom digital-cell library with application to a subthreshold MAC block.
-
Bertelsen, Patric Andre; Marchuk, Vitaly; Vestli, Snorre; Waagen, Johannes & Aunet, Snorre
(2024).
Optimizing Routing Architectures for Small-Scale Heterogeneous Systems.
Norges Teknisk Naturvitenskapelige Universitet.
-
Sollien Øfsti, Gjermund; Hasanbegovic, Amir & Aunet, Snorre
(2024).
NIRCA MkII DevKit Firmware Development.
Norges Teknisk Naturvitenskapelige Universitet.
-
Akre, Erlend; Aunet, Snorre; Wisland, Dag Trygve Eckhoff & Kjelgård, Kristian Gjertsen
(2023).
Ultra Low Power Digital Oscillator with Dynamic Leakage Suppression and compensation In 65nm BULK CMOS and 22nm FDSOI.
Universitetet i Oslo, Fysisk Institutt.
-
Jenab Mahabadi, Zahra & Aunet, Snorre
(2023).
High-Level Power Modelling for IoT.
Norges Teknisk Naturvitenskapelige Universitet.
-
Laurentui, Erhan & Aunet, Snorre
(2023).
Air Flow Sensor - Wireless and Battery Powered.
Norges Teknisk Naturvitenskapelige Universitet.
-
Abdoliniafard, Erfan; Shakouri, Shima; Qadir, Omer; Dieset, Herman Kristian & Aunet, Snorre
(2023).
Design and Analysis of a Voltage Controller for Memristor-based In-memory Computation.
Norges Teknisk Naturvitenskapelige Universitet.
-
Sedal Slagstad, Karsten; Marchuk, Vitaly & Aunet, Snorre
(2023).
Optimal Implementation of Configurable Logic Framework in AVR Microcontrollers.
Norges Teknisk Naturvitenskapelige Universitet.
-
Chen, Yini; Aunet, Snorre & Wisland, Dag Trygve Eckhoff
(2023).
Frequency ΣΔ Modulator with ultra-low power supplies towards sub-100 mV.
Universitetet i Oslo, Fysisk Institutt.
-
Moen, Ole-Tobias; Hasanbegovic, Amir & Aunet, Snorre
(2023).
Camera Link transmitter for high bandwidth ADC data readout.
Norges Teknisk Naturvitenskapelige Universitet.
-
Skirbekk, Asta; Ytterdal, Trond & Aunet, Snorre
(2023).
Design of SRAM for Sub-100mV Operation Using 22 nm FD-SOI.
Norges Teknisk Naturvitenskapelige Universitet.
Energy harvesting is a promising solution for Internet of Things (IoT) devices, as this removes
the need for frequent changes of batteries. Many energy harvesting solutions struggle to supply
a high voltage, and this provides a problem for on-chip memory which is often volatile and
therefore requires a reliable power supply at all times. On-chip memory must therefore be
designed to work at ultra low supply voltages.
The objective of this master’s thesis has been to use a 22 nm FD-SOI (Fully Depleted Silicon
On Insulator) transistor technology to create a custom Static Random Access Memory (SRAM)
for sub-100mV operation and to study how the minimum supply voltage is affected by process
variation and transistor mismatch. To achieve this one must also carefully design and study the
SRAM’s subcircuits, and this has therefore been a major part of this project. As minimising the
supply voltage has been the aim, this has been done even when it is at the cost of higher power
consumption and/or an increased chip area compared to SRAM circuits operating at higher
supply voltages. The SRAM was designed to operate at temperatures in the range 0 °C to 50 °C,
as this would allow it to be used in most indoor applications as well as in medical applications.
Physical layouts were created for a 4B SRAM, 16B SRAM, and a 64B SRAM, as well as for
all the SRAM’s subcircuits, to get more reliable and accurate simulation results. The three
SRAM layouts were found to operate at a minimum supply voltage of 85 mV when process and
temperature variations were considered. The SF corner had the worst post layout performance
for all circuits, and it was concluded that better balancing of the PMOS and NMOS transistors
would improve the performance in this corner considerably. This improvement can be done by
changing the transistor sizing strategy slightly as well as switching from merged to non-merged
transistors in the layout.
Monte Carlo simulations of transistor mismatch were run on the 4B SRAM and the 16B SRAM,
and good post layout yields were achieved for a supply voltage of 80 mV: 97% for the 4B SRAM
and 97.6% for the 16B SRAM. The performance of the SRAM’s subcircuits indicates
that the yield will remain high for larger SRAM circuits as well.
-
Sundsby Overholt, Ingvild Una; Lopez Tello Villafuerte, David Luis & Aunet, Snorre
(2023).
High Level Power Modelling for IoT.
Norges Teknisk Naturvitenskapelige Universitet.
-
Amble, Magnus; Kjelgård, Kristian Gjertsen; Wisland, Dag Trygve Eckhoff & Aunet, Snorre
(2023).
Cryptographic hardware resistant to Leakage-Based Differential Power Analysis Attacks.
Universitetet i Oslo.
As cryptographic encryption algorithms have progressed in complexity, adversaries have looked
to other means of accessing encrypted secret data than traditional cryptanalysis. One such
alternative method of intrusion is through side-channel analysis, where physical properties of a
cryptographic circuit are analyzed to reveal sensitive data.
This thesis presents analyses of countermeasures against side-channel analysis based on variations
in the power consumption of a circuit. A trend seen in the advancement of Complementary
Metal–Oxide–Semiconductor (CMOS) technology is an increase in static power consumption. In
the context of cryptographic circuits, this aspect of the power consumption has proven to contain
information about sensitive data processed by the circuit, making it a potential attack vector for
an adversary.
Three gate-level countermeasures against leakage power analysis attacks are proposed in this thesis.
Two of these are logic styles based on the equalization of a logic gate’s power consumption between
input combinations, coined Octuple logic and Quadruple Dual-Pulldown Logic (QDPL). Both
of these countermeasures expand on the concept of a countermeasure from the literature called
Exhaustive Logic Balancing (ELB). The third proposed countermeasure is a current-masking
scheme that introduces input-independent noise in the power consumption through periodically
randomizing the voltage on the bulk-terminals of the logic gate. Combining such a logic style
with the bulk-voltage masking scheme lowers the possible signal-to-noise ratio obtainable by
an adversary by both adding additional noise to, and by lowering the data-dependency in, a
circuit's power consumption. Preliminary experiments showed that the masking scheme is able to
modulate the static current of an ELB NAND gate within an interval where the highest current
is six times larger than the lowest current. Subsequently, rough estimates show that if signal
averaging analyses are conducted on a register implemented in QDPL, the mask could make
the analyses require over 800000 additional signal measurements, compared to when no mask is
applied, to find a one-bit Hamming weight difference in the register’s static current.
-
Bogevik, Olav; Qadir, Omer & Aunet, Snorre
(2022).
Evaluate potential of approximate multipliers for energy efficient inference in DNNs.
Norges Teknisk Naturvitenskapelige Universitet.
-
Sæther, Harald; Ytterdal, Trond & Aunet, Snorre
(2022).
A Sub-100mV Supply Voltage Standard-Cell Based Memory in 22nm FD-SOI.
Norges Teknisk Naturvitenskapelige Universitet.
The desire for reduction in power consumption has motivated the design of integrated circuits operating in the sub-threshold domain. Circuits operating at subthreshold supply voltages need robust architectures and techniques that can withstand effects that are pronounced in the sub-threshold domain. Effects such as an increased sensitivity towards process variation and a diminished on-to-off current ratio can negatively affect the circuit functionality. Many applications, such as energy harvesting in mm-scale nodes and biomedical devices, call for reliable and efficient integrated circuits with as low a supply voltage as possible. Currently, charge pumps/voltage converters, which have poor efficiency, need to be used to convert the output voltage of an energy harvester into a higher voltage, which the rest of the circuit runs on. This costly conversion can be reduced or avoided if the rest of the circuit runs at as low a supply voltage as possible. Memory is often used in complex digital circuits like mm-scale nodes and biomedical implants, occupying a vast amount of circuit area, and is thus a component where huge power savings can be gained if it is designed for as low a supply voltage as possible. In this thesis, such a memory is designed for operation at sub-100 mV supply voltage, with robust techniques applied from the transistor level to the architectural abstraction level of the memory. Several classic static logic gates are benchmarked in terms of power, performance and layout area against so-called Schmitt Trigger structures, which have been proven to operate at very low supply voltages. From standard cells (NAND, NOR and NOT logic gates), a simple standard cell-based memory (SCM) is constructed, which includes structures such as multiplexers, decoders, pre-decoders, clock gates and data flip-flops (DFF). The constructed memory is then simulated and modelled to verify functional chip yield for a supply voltage of 87 mV. The complete SCM, which is 1024 bits in size, with 128 different addresses storing 8 bits of data at each address, shows good functional chip yield, with the lower bound of yield above 90% with a maximum redundancy of 4. Operating frequency is 150 Hz, with an average power consumption of 6.991 nW. The SCM's total layout area is 338741.1 µm².
-
Hossein Zadeh, Somayeh; Ytterdal, Trond & Aunet, Snorre
(2022).
Energy Efficient Subthreshold Digital Building Blocks.
NTNU.
ISSN 1503-8181.
Many IoT applications, such as implantable biomedical devices and sensor nodes, operate in the kHz range, and power consumption is the primary concern in such applications. However, the required voltage of most implantable electronic devices is 2-3 V [47]. The output voltage of the most recent in vivo energy harvesters (IVEHs) is 150 mV and below [47, 61], which could suit the low supply voltages of subthreshold circuits while saving energy by avoiding DC-DC conversion as energy-costly as that needed for higher supply voltages. Therefore, subthreshold circuits operating at supply voltages lower than the absolute value of the threshold voltage of the transistors might be the best option for such applications. The power consumption is reduced as the circuit supply voltage is lowered towards and below the threshold voltage of the transistors, but the propagation delays increase. This may not be a concern for low to medium performance. Voltage scaling in integrated circuits brings challenges that a designer has to consider during the design phase. The impact of process, voltage, and temperature variations increases with voltage scaling and affects the functionality of the circuits.
This thesis focuses on designing and exploring energy efficient computing and memory circuits in the ultra low voltage subthreshold regime at different abstraction levels.
Techniques such as body biasing (reverse body bias), transistor stacking, device sizing, and multi-threshold voltage devices have been explored at the gate level to reduce the power consumption, especially static power, taking into account reliability and process, voltage and temperature variations. At the circuit level, different topologies of full adders based on standard CMOS and designed for subthreshold supply voltages have been compared considering functionality and reliability. In addition, an optimal back gate bias has been proposed in a commercially available 22 nm FDSOI (Fully Depleted Silicon On Insulator) technology that minimizes the energy per operation of subthreshold digital CMOS circuits and improves the reliability. The adder used as a case study consumes 4.67 percent less energy under optimal body bias than under zero body bias at Vdd = 150 mV and a frequency of 1 kHz.
At the architectural level, two different types of adders, the Kogge Stone adder (KSA), the fastest adder, and the Ripple Carry adder (RCA), the simplest adder, have been designed and fabricated for supply voltages as low as Vdd = 140 mV. The adders have been synthesized at the gate level using a full custom standard cell library designed for the ultra low voltage subthreshold regime.
The gap between simulation and measurement results is filled with successful implementation and comparison of the ultra low voltage subthreshold adders at a supply voltage as low as 140 mV. To the best of the author's knowledge, this is the first measurement comparison between two different adder architectures for ultra low supply voltages as low as 140 mV. Simulated results in [7] indicated that the RCA is 1.36X more energy efficient than the KSA at the same speed. Measured results presented here show that the RCA is 4.15X to 1.92X more energy efficient than the KSA at supply voltages between 250 and 500 mV. In addition, the RCA designed in this study outperforms the reported works in terms of a defined FoM, which is (Tech)/(Vmin · Energymin). Digital circuits designed for applications like sensor networks, implantable biomedical devices and environmental monitoring need to work under different conditions, for example over the temperature range in which the circuit should work. In this thesis, we have studied the performance of the circuits at different temperatures and supply voltages and in the presence of mismatch and process variations. <further contents not included due to space limitations>
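The figure of merit quoted in this summary, (Tech)/(Vmin · Energymin), is simple to evaluate. The sketch below is illustrative only: the summary does not state the units, so nanometres, volts and joules are assumptions, and the two example designs are hypothetical and not taken from the thesis.

```python
def figure_of_merit(tech_nm, vmin_v, energy_min_j):
    """FoM = Tech / (Vmin * Energymin).

    Units (nm, V, J) are an assumption; the summary does not state them.
    """
    if vmin_v <= 0 or energy_min_j <= 0:
        raise ValueError("Vmin and Energymin must be positive")
    return tech_nm / (vmin_v * energy_min_j)


if __name__ == "__main__":
    # Hypothetical designs in the same technology node, for illustration only.
    design_a = figure_of_merit(tech_nm=22, vmin_v=0.14, energy_min_j=2e-15)
    design_b = figure_of_merit(tech_nm=22, vmin_v=0.25, energy_min_j=2e-15)
    print(design_a > design_b)  # lower Vmin at equal energy gives a higher FoM
```

With this definition, a design scores higher when it reaches a lower minimum supply voltage or a lower minimum energy in a given technology node.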
-
Sarker, Anjan & Aunet, Snorre
(2021).
Analysis, comparison and implementation of open-source Floating Point Units for energy harvesting applications.
Norges Teknisk Naturvitenskapelige Universitet.
-
Steinsland, Christian Rosioara; Aunet, Snorre & Ytterdal, Trond
(2021).
Design and Implementation of a Digital Standard Cell Library for 28 nm Technology.
NTNU.
A digital standard cell library has been designed and implemented for a 28 nm technology. The library
has been designed and optimized for a supply voltage of 300 mV, to be compatible with a standard
design flow. Each cell has been characterized with extracted parasitic components. Combinatorial logic
gates, including compound logic gates, and sequential cells were implemented with SLVT (Super Low
VT) transistors. The library has been used to synthesize a functional RISC-V architecture (PicoRV32).
The motivation was to verify the functionality of the standard cell library and obtain quantitative results
of the performance of the library. The minimum energy point (at room temperature in the TT-corner) for
the CPU was found to be with a supply voltage of 500 mV and a frequency of 20 MHz. By increasing
the supply voltage to 600 mV, the CPU supports a 50 MHz clock. The highest simulated frequency was
250 MHz at 1 V.
-
Rud, Markus; Moldsvor, Øystein & Aunet, Snorre
(2021).
Power and energy consumption in hardware implemented SPI master devices.
NTNU.
This thesis presents an analysis of two different VHDL designs of an SPI master device implemented on a model of an FPGA. The analysis focuses on power and energy consumption in the devices compared with the functionality provided. The two devices differ in design strategy: one is a simple design in which only the logic required to conduct an SPI transmission is implemented. The second is a more complex design in which transmission parameters such as setup and hold time can be adjusted after implementation, and which has a more complex interface to the logic that controls the SPI masters. The complex implementation also implements two FIFO registers to store multiple messages during transmission and reception.
The analysis is based on different tests intended to give an understanding of which elements of an SPI master affect the energy consumption of the device. These tests look into the impact of operating frequency, communication frequency, operation mode and alterations to the utilized logic. The designs are implemented on a model of an FPGA using the development tool Vivado. The two designs are also power optimized using the built-in power optimizer in Vivado.
The results from the analysis show that implementing an SPI master requires a trade-off between functionality and energy consumption. The different implementations are analysed over a frequency span of 1 MHz to 15 MHz, where the complex master requires 27.2% more energy than the simple master on average. A higher complexity in the design therefore requires more energy. The complex master utilizes more than twice as much logic, but not twice as much energy, so the energy cost of added functionality is heavily dependent on the switching activity in the added logic. The results also show that it is preferable to operate the tested SPI masters at the highest frequency possible within the tested frequencies, since this gives the lowest energy consumption. This result is to some extent limited by the implementation method, as implementing the SPI master on an FPGA removes some potential benefits of operating the design at a low frequency, such as smaller transistor sizes and lower operating voltage. The two SPI designs also react similarly to adjustments of transmission parameters such as communication frequency and operating mode, since the percentage energy difference caused by the adjustments is approximately the same for both designs.
The analysis has some limitations. The SPI masters are analysed as standalone devices not connected to any controlling device, which limits the energy analysis due to missing timing delays. The SPI masters are also relatively small designs, so when they are implemented on an FPGA that is large compared to the designs, a large static power overhead is added, which can hide the actual static power consumption of the designs themselves.
-
Nielsen, Auseth, Øivind; Aunet, Snorre & Ness, Torbjørn Viem
(2021).
Modelling of Cache/Interconnect Performance in an Embedded System.
NTNU.
The cache and bus topology in an embedded system has a big influence on the system performance and energy efficiency. Producing a new silicon chip for testing is expensive, and this has made simulating architectural changes a common practice in the industry. Modern RTL simulations are able to provide highly accurate estimates of integrated circuit performance. However, modelling an architecture for such a simulation is a long and laborious process, and making modifications to the model is often time consuming. There is a need for a quick and easy way of experimenting with different bus topologies, which is still able to provide a good estimate of how various changes will affect performance.
This thesis presents an easy-to-use model, which allows for completely changing an architecture just by modifying a few values in a text file. The model uses a node tree representation of the bus hierarchy, abstracting away hard-to-model architecture features, such as timing jitter when crossing between clock domains, and bus contention. Each node represents a component in the interconnect topology and explicitly states the latency it contributes to a memory access. This allows a highly simplified architecture description to be written, only focusing on the aspects of the bus topology which significantly contribute to memory access performance. The bus topology simulated for testing purposes in this thesis is the main Cortex-M33 processor on Nordic Semiconductor's nRF5340 SoC.
The simulation results were compared to tests running on an nRF5340 development kit. When simulating Coremark with cache enabled, the model reported an 18.70% longer run time than hardware. When simulating Coremark with cache disabled, the model reported a 4.36% longer run time than hardware. When simulating sequential accesses to matrices, the model reported 68% more instruction cache hits and 18% fewer latency cycles than hardware for the smallest matrices, and 50% more instruction cache hits and 5% fewer latency cycles than hardware for the largest matrices. The model displayed high fidelity, but the simulated results were offset from the hardware results. It is useful for predicting whether a change would lead to an increase or decrease in instruction cache performance and total latency cycles. There were issues outside the scope of this thesis which were significant error sources in the results. In preliminary testing of how the model would perform if these issues were resolved, it reported cache hit and cache miss results which were within 0.2% of the results seen on hardware, and cache latency results which were 7.3% higher than the results seen on hardware. This shows that the model has the potential to achieve very accurate results, and be a useful tool for initial exploration of the performance impact of modifications to cache and bus topology.
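The node tree idea in this summary, where each node states the latency it adds to a memory access, might be sketched as follows. This is not the thesis's model: the class, the function, the component names and the cycle counts are all hypothetical, the topology is hard-coded rather than read from a text file, and it does not describe the nRF5340.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One component in a (hypothetical) interconnect tree."""
    name: str
    latency: int                                   # cycles added by this component
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child


def access_latency(root, target):
    """Sum the latencies on the path from root to the node named target.

    Returns None if no node with that name exists in the tree.
    """
    if root.name == target:
        return root.latency
    for child in root.children:
        below = access_latency(child, target)
        if below is not None:
            return root.latency + below
    return None


if __name__ == "__main__":
    # Made-up topology and cycle counts, for illustration only.
    cpu = Node("cpu", 0)
    cache = cpu.add(Node("cache", 1))
    bus = cache.add(Node("bus", 2))
    bus.add(Node("flash", 5))
    bus.add(Node("ram", 1))
    print(access_latency(cpu, "flash"), access_latency(cpu, "ram"))
```

Changing the architecture then amounts to editing the tree, for example moving a memory under a different bus node or changing one latency value, which matches the summary's stated aim of making such changes quickly.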
-
Skavnes, Solveig; Moldsvor, Øystein; Hagen, Anders Ivar & Aunet, Snorre
(2021).
Design and Implementation of a Magnetic Energy Harvesting System for Low Primary Current Applications.
NTNU.
Energy harvesting is the act of exploiting small amounts of ambient energy to power a low power system, like a sensor node. This can be done using the magnetic field originating from an AC current, Ip, flowing through a power cable. The aim of this thesis is a system that is able to deliver 3.3 V to charge a battery while Ip is under 1 A. The system should work for values of Ip under 1 A, since this will allow it to charge a battery much of the time even if the cable it is connected to is not carrying high AC currents. A magnetic energy harvesting system can consist of a current transformer (CT), a rectifier and a DC/DC converter. Such a system is designed, simulated using SPICE and implemented using components on a breadboard. The different sub-systems are tested individually and together. Five different commercial CTs are tested. Three different rectifiers are designed and tested, and two DC/DC converters are tested. This makes a total of 30 sub-system configurations, of which three fulfil the requirement of working at Ip < 1 A. The combination of sub-systems that manages to deliver 3.3 V to a battery at the lowest Ip is the combination of a CT with a 1:2000 turn ratio, a Schottky diode rectifier and a high input resistance DC/DC converter. This system delivers 3.3 V to an attached battery at Ip = 0.5 A. It is tested and verified that the three sub-system configurations that work at Ip < 1 A continue to work up to Ip = 16 A. A commercial energy harvesting system with a combined rectifier and DC/DC converter, the LTC3331, is tested with the five CTs under the same conditions as a comparison. Together with the best performing CT, it requires Ip = 5 A to deliver 3.3 V to a battery, making the commercial solution significantly worse than the system designed in this thesis.
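The count of 30 configurations follows from combining every CT with every rectifier and every DC/DC converter (5 × 3 × 2). A minimal sketch of that enumeration is given below; the component labels are placeholders, since the summary does not name the individual parts.

```python
from itertools import product

# Placeholder labels; the summary does not identify the individual components.
current_transformers = [f"CT{i}" for i in range(1, 6)]        # five commercial CTs
rectifiers = ["rectifier_A", "rectifier_B", "rectifier_C"]    # three rectifier designs
dcdc_converters = ["dcdc_A", "dcdc_B"]                        # two DC/DC converters

configurations = list(product(current_transformers, rectifiers, dcdc_converters))

if __name__ == "__main__":
    print(len(configurations))
    for ct, rect, dcdc in configurations[:3]:
        print(ct, rect, dcdc)
```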
-
Rotevatn, Synnøve Andersen; Barzic, Ronan & Aunet, Snorre
(2021).
Automated Desynchronization Using Pyverilog.
NTNU.
Asynchronous circuits are inherently more robust to delay variability than the more common synchronous circuits. Automated desynchronization allows for making asynchronous circuits by removing the global clock and replacing it with handshake circuitry, without changing the design flow. The changes in the Verilog code needed to desynchronize a synchronous circuit into a simple Muller pipeline are presented. An implementation in Python using Pyverilog is made to automate that process. It is tested on two small designs. One is a linear pipeline consisting of just registers, and the other is a non-linear pipeline with a fork, a join and combinational logic. The implementation succeeds in transforming the designs into asynchronous circuits, and both pipelines pass tests that are comparable to the tests of the original circuits. However, the simplicity of the type of asynchronous circuit that is chosen means that the functionality is not equivalent to the original. Lastly, the implementation is tested on a RISC-V CPU core. All the same changes are made as with the smaller designs, but the solution is not advanced enough to handle such a complex circuit. A number of limitations to the current implementation are identified, which would need to be improved to have useful, automated desynchronization. The method is not optimal for all cases, but can be the most useful for someone who has full control of the coding style of the circuit they are desynchronizing.
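The handshake circuitry of a Muller pipeline is conventionally built from Muller C-elements. As a rough illustration, the behavioural model below shows the C-element's state-holding rule in Python. It is an assumption-laden sketch for orientation only, not the Verilog or Pyverilog code of the thesis, and the class name is hypothetical.

```python
class CElement:
    """Behavioural model of a two-input Muller C-element.

    The output follows the inputs when they agree and holds its
    previous value when they differ.
    """

    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        if a == b:
            self.out = a
        return self.out


if __name__ == "__main__":
    c = CElement()
    for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]:
        print(a, b, "->", c.update(a, b))
```

In a Muller pipeline, one input of each stage's C-element is the request from the previous stage and the other is the inverted acknowledge from the next stage, so a stage only accepts new data once its successor has consumed the old data.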
-
Aunet, Snorre; Ytterdal, Trond; Lande, Tor Sverre & Moulin, Kaspar Sigurd
(2021).
Low-Power Neuromorphic Sound Source Localization.
NTNU.
In this thesis, we present a neuromorphic approach to the sound source localization problem, based on the Jeffress model of sound source localization. The proposed system uses binary neural spike coding and asynchronous axonal delay lines to efficiently compute the cross-correlation in the time domain and determine the interaural time delay (ITD) between binaural input signals. While more research is needed, the proposed approach potentially offers significant improvements over current hardware implementations, with regard to power consumption and response time. In the test cases simulated, the system estimates the ITD to within 10 µs, with an average power consumption of 350 nW. Simulations also showed a sub-millisecond reaction time to changes in the ITD. However, the proposed system is highly dependent on accurate component matching for proper functioning, and we find that additional compensation techniques would be required for a fabricated chip.
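The core of the Jeffress model, coincidence detection along delay lines, amounts to a time-domain cross-correlation of the two spike trains: the delay with the most coincidences is the ITD estimate. The following is a minimal software sketch of that idea on discrete binary spike trains. It is not the circuit described in the thesis, and the function names, the spike trains and the time step `dt` are assumptions made for illustration.

```python
def coincidence_counts(left, right, max_lag):
    """Count coinciding spikes for each lag in [-max_lag, max_lag].

    A positive lag means the right signal is delayed relative to the left.
    """
    counts = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for i, spike in enumerate(left):
            j = i + lag
            if spike and 0 <= j < len(right) and right[j]:
                total += 1
        counts[lag] = total
    return counts


def estimate_itd(left, right, max_lag, dt):
    """Return the lag with the most coincidences, converted to seconds."""
    counts = coincidence_counts(left, right, max_lag)
    best_lag = max(counts, key=counts.get)
    return best_lag * dt


if __name__ == "__main__":
    # Hypothetical spike trains: the right ear hears the same spikes 3 steps later.
    length = 24
    left = [1 if i in (2, 10, 17) else 0 for i in range(length)]
    right = [1 if i in (5, 13, 20) else 0 for i in range(length)]
    print(estimate_itd(left, right, max_lag=5, dt=10e-6))
```

Each lag in the loop plays the role of one coincidence detector at a different position along the delay lines. In hardware these detectors operate in parallel, whereas this sketch evaluates them one after another.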
-
Ramberg Møklegård, Benjamin & Aunet, Snorre
(2020).
AI on device Visual Occupancy Detection.
Norges Teknisk Naturvitenskapelige Universitet.
-
Steen Fosse, Carl Richard; Tjora, Sigve & Aunet, Snorre
(2020).
Power consumption in relation to communication protocols used by a Nrf9160.
Norges Teknisk Naturvitenskapelige Universitet.
-
Schoepe, Felix Allan; Gamst Reichelt, Pål Øyvind & Aunet, Snorre
(2020).
Ultra-low power accurate temperature sensor for IoT.
NTNU.
The analog front-end of a proposed BJT-based temperature sensor has been designed. The design has been analysed using Monte-Carlo simulations with mismatch over all process
corners. The design was implemented in a 90 nm generic process design kit. The analog front-end achieves ultra-low power consumption with a current consumption of 2.3 mA at a temperature of 27 °C and can operate over the military temperature range of -55 °C to 125 °C. It uses a voltage supply of 2 V. An adaptive self-biasing operational amplifier was implemented in the bandgap reference
circuit to ensure sufficiently high DC loop-gain. It operates on an ultra-low current consumption
of 631 nA. To reduce errors due to mismatch and process spread two correction techniques have been employed, that is chopping of the input signals of the operational
amplifier and dynamic element matching of the current sources in the bipolar core.
The residual temperature reading error at the output of the analog front-end is large and
results in significant errors. This is because no compensation technique was implemented
in the analog front-end to compensate for process spread of the BJTs and should be implemented
digitally. The temperature reading errors due to offset and mismatch in the
current sources have been reduced to around 0.03 °C at a temperature of 27 °C. The noise
in the circuit was analysed without the dynamic effects of the DEM because of its simple
implementation and results in an equivalent temperature error of around 0.12 °C, which indicates the resolution that is achievable. The settling time of the circuit is in the range of 110 ms.
-
Bygland, Embla Trasti; Austbø, Knut & Aunet, Snorre
(2020).
Power Modeling of Complex Designs.
NTNU.
In this project, a tool for making power models of designs at the Register Transfer Level (RTL) is implemented. The generated power model is intended to be used with a power estimation tool to give an early, fast and accurate power estimate. Nordic Semiconductor ASA issued this master's project with the motivation of making RTL simulations power-aware. Discovering power bugs early in the implementation of a design may save iterations in the Application Specific Integrated Circuit (ASIC) design flow, and thus reduce time to market for a product.
The top-down method for estimating power at the RTL was chosen for the implementation. Among other desired qualities, it does not require a gate-level representation of the design to produce a power estimate. This allows power estimation to be done concurrently with simulations for functional verification of the RTL, before synthesis of the design.
The power modeling problem is divided into three tasks:
1. Extracting structural information from an elaborated SystemVerilog representation of the design.
2. Extracting information about available cells and their power consumption characteristics from the cell library.
3. Combining the structural representation with the cell- and power information retrieved, in order to create a power model.
In the implementation, the structure of the design is represented by a node tree, while a cell library object was created to represent the available cells from the cell library and their power data. To produce a power model, the implementation takes sequences of generic cells from the structure tree and replaces them with cells obtained from the cell library. The power model consists of several power-aware node trees. The power model representation is more similar to the gate-level netlist than the elaborated SystemVerilog representation is. However, more work is needed to obtain a proper comparison between them.
The implementation shows promise for accurate and fast power estimation. Several abstractions are made in the process so that fast estimations can be made, and their effect on the power consumption has been evaluated together with other alternatives. When creating the power-aware node tree, cells from the generic cell library are grouped into more complex cells from the cell library. This grouping reduces the number of cells, which brings the model closer to the gate-level representation.
Some work remains to complete the power model: the most complex generic cells from the elaborated SystemVerilog file need to be constructed from several cells from the cell library. Complex cells with no equivalent yet are those representing arithmetic operations, shifters and comparators. When these cells have a representation, switching activity can be propagated through the structure trees in order to get a power consumption estimate for each of them. The final job of the power estimation tool is to use solely the activity data from the RTL simulation, together with the power values from each structure tree, to yield the power estimate.
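The cell-grouping step described in this summary can be sketched as a greedy pattern replacement followed by a simple activity-based estimate. This is only an illustration under stated assumptions: the library contents, cell names, power figures, and the functions `map_cells` and `estimate_power` are all hypothetical and are not taken from the thesis or from any real cell library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibCell:
    name: str
    leakage_w: float            # static power in watts (made-up value)
    energy_per_toggle_j: float  # switching energy in joules (made-up value)


# Hypothetical mapping from sequences of generic cells to library cells.
LIBRARY = {
    ("AND", "NOT"): LibCell("NAND2", 1.0e-9, 2.0e-15),
    ("AND",): LibCell("AND2", 1.5e-9, 3.0e-15),
    ("NOT",): LibCell("INV", 0.5e-9, 1.0e-15),
}


def map_cells(generic_cells):
    """Replace sequences of generic cells with library cells, longest match first."""
    longest = max(len(pattern) for pattern in LIBRARY)
    mapped, i = [], 0
    while i < len(generic_cells):
        for length in range(longest, 0, -1):
            pattern = tuple(generic_cells[i:i + length])
            if pattern in LIBRARY:
                mapped.append(LIBRARY[pattern])
                i += len(pattern)
                break
        else:
            raise KeyError(f"no library cell for generic cell {generic_cells[i]!r}")
    return mapped


def estimate_power(cells, toggles_per_second):
    """Sum leakage plus switching power, assuming one toggle rate for every cell."""
    return sum(c.leakage_w + c.energy_per_toggle_j * toggles_per_second for c in cells)


mapped = map_cells(["AND", "NOT", "NOT", "AND"])
names = [cell.name for cell in mapped]
power_w = estimate_power(mapped, toggles_per_second=1e6)
```

Grouping the four generic cells into three library cells mirrors the reduction in cell count that the summary mentions. A real flow would propagate per-net switching activity through the tree rather than use a single toggle rate.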
-
-
Gausdal, Kaja; Gonsholt, Kyrre Erlend; Qadir, Omer & Aunet, Snorre
(2020).
Applying Asynchronous Completion Detection to the AVS Domain.
NTNU.
Power consumption has become one of the leading challenges designers face as technology continues to scale. Classical worst-case corner analysis is overly pessimistic, since increasing PVTA variations must be compensated for by excessive safety voltage margins. Certain AVS approaches minimize power consumption by monitoring the circuit's timing with error detectors and adapting the supply voltage to the chip's actual operating condition, consequently eliminating the need for excess voltage margins.
Asynchronous completion detectors share many similarities with AVS error detectors. This thesis aims to evaluate the feasibility of applying a selection of bundled-data and dual-rail completion detectors to the AVS domain. Through a qualitative analysis, the asynchronous dual-rail scheme was deemed most suitable for adaptation due to its robustness towards PVTA variations. A novel dual-rail AVS approach was proposed, aiming to reduce unnecessary voltage margins by implementing a QDI dual-rail error detector. A dual-rail RTL library was constructed and used to map a single-rail MAC into a dual-rail function block. Sixteen function block variants were implemented and compared to a single-rail reference. The functionality of the error detector was verified through simulation, and results show the power, area, and timing overhead of the error detector to be either relatively constant or converging towards a lower bound for higher input size. The area and power overhead of the error detector are considered manageable and not a diminishing factor. Through tighter timing constraints, the timing overhead of the error detector critical path was minimized. A dual-rail technology library was proposed as a way to minimize area, power, and timing overhead. The synthesized error detector showed violations of QDI constraints, which compromise the robustness of the system. Actions were suggested to correct the violations, but due to the lack of appropriate tools, the overall robustness of the system was hard to quantify. The implemented dual-rail error detector was concluded to show promise as a proof of concept, and directions for future work were given.
-
Fini, Simone; Ytterdal, Trond; Aunet, Snorre & Lavagno, Luciano
(2019).
Sub-Threshold Design of Arithmetic Circuits: when Serial might overcome Parallel Architectures.
NTNU.
Adder circuits are vital for microprocessors; indeed, apart from addition itself, subtraction, multiplication or division algorithms may require, at a certain point, the addition of two (partial or not) operands. For this reason, several architectures have been studied and improved over the last decades in order to speed up this operation.
On the other hand, it is well known that faster circuits mean higher complexity and, therefore, higher power consumption. In addition, the downscaling of transistors has increased the leakage current of these devices, which accounts for up to 33% of the total dissipation. Because battery capacity is limited relative to the achievable performance of circuits, the main challenge for engineers and designers is to exploit low-power techniques so as to decrease the power consumption of electronic devices as much as possible.
This Master's thesis aims to demonstrate that, when working in the sub-threshold region, it might be possible to employ simple and repetitive circuits, such as ripple carry adders, instead of complex ones, such as Kogge-Stone architectures, with the same propagation time but significantly lower energy consumption. In this way, it would be possible to have, at the same time, the performance of a fast adder and the area and energy dissipation of its simpler and weaker "ancestor".
As the final results show, the technology employed and the choice of the best available architecture resulted in a great improvement with respect to the previously conducted study.
First, with the development of a new full adder circuit (the so-called "XMAJ3"), it is possible to reduce the energy consumption with respect to ripple carry adders based both on existing architectures and on the full adder cell contained in the library. This is achieved without customized gates, using only standard logic blocks already contained in the library.
Second, FDSOI technology makes it possible to equalize the performance of serial and parallel adders while saving energy, even in the super-threshold region, which avoids the problems that sub-threshold design brings. In particular, for 32-bit devices, the average energy saving with respect to the Kogge-Stone adder is 41.48% (with a peak of 56.16%), while for 64-bit adders the mean saving is 50.02%, with a maximum of 56.83%.
-
Boland, Connor; Aunet, Snorre & Moldsvor, Øystein
(2019).
Low Power Environmental Air Quality Monitoring.
NTNU.
Measuring personal environmental air quality is becoming increasingly relevant in today's society. Traditional monitoring requires oversized, expensive instruments, with slow and intensive sampling methodologies. Scalable, low-cost monitors are required to replace these outdated designs, to increase global access to air quality data.
Disruptive Technologies provides long lifetime IoT solutions, specialising in smart wireless sensors. The aim of this project is therefore to develop a battery-operated environmental air quality monitor, with lifetime standards in line with those at Disruptive Technologies. The monitor must not only be able to measure a number of harmful air contaminants but also to measure the quality of the working environment. Specific emphasis on particulate matter sensing will provide a monitor with commercial viability.
A number of particulate matter sensors were introduced. Initial current measurements were conducted on each to test its feasibility as part of a complete battery-powered monitor. Of all the tested sensors, only the Sharp GP2Y10AU0F compact optical dust sensor showed a current consumption low enough for battery operation. A number of techniques to reduce this power consumption were introduced as part of a test design with the Sharp sensor and a microcontroller. Overall, the calculated energy requirement for the sensor came to 1 638 J for a two-year lifetime. This was combined with two other sensors (one measuring both volatile organic compounds (VOC) and carbon dioxide (CO2), and the other measuring both temperature and humidity) as a model of the complete energy requirements for the monitor. The total calculated energy consumption came to 125 968 J, far exceeding a standard battery capacity of up to 39 960 J. The unforeseen power limitation of this design lay with the chosen VOC sensor. However, the project has still shown feasible opportunities for a battery-powered environmental air quality monitor capable of measuring particulate matter.
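The energy figures in this summary can be turned into average power and an implied battery lifetime. The following is a rough sketch that assumes a constant average draw, 365-day years, and a battery whose full rated energy is usable; the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_power_w(energy_j, years):
    """Average power needed to spend `energy_j` joules evenly over `years`."""
    return energy_j / (years * SECONDS_PER_YEAR)


def lifetime_years(battery_j, required_j, required_years):
    """Lifetime on `battery_j` joules when `required_j` is needed per `required_years`."""
    return required_years * battery_j / required_j


# Figures quoted in the summary.
dust_sensor_power_w = average_power_w(1638, 2)       # about 26 µW
full_monitor_power_w = average_power_w(125968, 2)    # about 2 mW
implied_lifetime = lifetime_years(39960, 125968, 2)  # about 0.63 years

print(f"dust sensor: {dust_sensor_power_w * 1e6:.1f} µW")
print(f"full monitor: {full_monitor_power_w * 1e3:.2f} mW")
print(f"implied lifetime: {implied_lifetime * 365:.0f} days")
```

Under these assumptions the complete monitor would last roughly 230 days on the quoted battery instead of the two-year target, while the dust sensor alone stays well within budget.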
-
Christensen, Steinar Thune; Aunet, Snorre & Qadir, Omer
(2019).
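A Configurable and Flexible Architecture for Low-Power, Energy-Efficient Hardware Acceleration of Convolutional Neural Networks [En Konfigurerbar og Fleksibel Arkitektur for Laveffekt, Energieffektiv Maskinvareakselerasjon av Nevrale Nettverk basert på Foldning].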
NTNU.
Convolutional neural networks (CNNs) have become paramount in today's Artificial Intelligence (AI) and Machine Learning applications. This is true for image recognition in particular. This thesis presents a configurable, versatile and flexible architecture for hardware acceleration of CNNs that is based on storing and accumulating the entire feature maps in local memory inside the accelerator. The architecture aims to process any type of CNN while consuming as little power as possible and achieving the highest possible energy efficiency, which refers to the number of operations per unit energy (measured in Multiply-Accumulate operations per unit energy, MACs/s/W or MACs/J). Several versions of the architecture have been synthesized and tested using different configurations. It performs well when compared to the state of the art, improving energy efficiency by more than a factor of 5 for select CNN layers. The most efficient configuration achieves 175 GMACs/s/W, while consuming 2.3 mW of power and occupying 585 KGEs (Kilo Gate Equivalents) of area at a 1 V supply voltage and a 100 MHz clock. This is a significant improvement over Eyeriss [YuH17b] (a state-of-the-art accelerator), which has a maximal energy efficiency of 122.8 GMACs/s/W.
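The efficiency metric used in this summary converts directly into energy per operation, since MACs/s/W equals MACs/J. The following is a small sketch of that arithmetic; the function names are hypothetical, and the MACs-per-cycle figure assumes that the quoted efficiency, power, and clock all refer to the same operating point, which the summary does not state explicitly.

```python
def energy_per_mac_pj(gmacs_per_s_per_w):
    """Energy per MAC in picojoules: the reciprocal of MACs per joule."""
    return 1e12 / (gmacs_per_s_per_w * 1e9)


def macs_per_cycle(gmacs_per_s_per_w, power_w, clock_hz):
    """Implied MACs per clock cycle at the given power and clock."""
    throughput_macs_per_s = gmacs_per_s_per_w * 1e9 * power_w
    return throughput_macs_per_s / clock_hz


proposed_pj = energy_per_mac_pj(175.0)    # about 5.7 pJ per MAC
eyeriss_pj = energy_per_mac_pj(122.8)     # about 8.1 pJ per MAC
peak_ratio = 175.0 / 122.8                # about 1.43
implied_parallelism = macs_per_cycle(175.0, 2.3e-3, 100e6)  # about 4 MACs per cycle
```

The ratio of peak efficiencies is about 1.43; the factor of more than 5 quoted in the summary applies to select CNN layers rather than to these peak figures.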
-
Paintsil, Wesley Ryan; Aunet, Snorre & Moldsvor, Øystein
(2019).
A comparative study for commercial TVOC sensors.
NTNU.
Indoor air quality has become more important as people spend an increasing share of their time indoors. The Internet of Things has allowed humans to better interact with the physical world by extracting information through sensor nodes. Indoor air quality sensors of different types have become increasingly popular; this thesis focuses on total volatile organic compound (TVOC) sensors. Three different sensors have been chosen and their operation evaluated in terms of power consumption. The Bosch BME680 proved to be the best sensor in terms of low power consumption: it had a lifetime of 4847 days as an alarm system and 2403 days as a monitoring system. The Integrated Device Technology (IDT) ZMOD4410 had a lifetime of only 8 days. The AMS CCS811 had a lifetime of 871 days as an alarm system and 191 days as a monitoring system.
The BME680 had an integrated humidity and temperature sensor, which provided an advantage by letting the host controller sleep longer. It should be possible to drive the BME680 and the CCS811 sensor with a 230 mAh battery for more than 2 years.
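The two-year target on a 230 mAh battery implies a ceiling on the average current draw. The following is a minimal sketch of that relation, assuming an ideal battery with no self-discharge and a constant average current; the function names are hypothetical, and the summary does not state which battery capacity the per-sensor lifetimes above were computed for.

```python
HOURS_PER_DAY = 24


def max_average_current_ua(capacity_mah, lifetime_days):
    """Largest average current (µA) that still reaches `lifetime_days`."""
    return capacity_mah * 1000.0 / (lifetime_days * HOURS_PER_DAY)


def lifetime_days(capacity_mah, average_current_ua):
    """Lifetime in days for a given average current draw in µA."""
    return capacity_mah * 1000.0 / (average_current_ua * HOURS_PER_DAY)


# Two years (730 days) on the 230 mAh battery mentioned in the summary.
budget_ua = max_average_current_ua(230, 730)   # about 13.1 µA
```

Any sensor and host controller combination whose combined average current stays below roughly 13 µA would meet the two-year figure under these assumptions.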
-
Vatanjou, Ali Asghar; Ytterdal, Trond & Aunet, Snorre
(2019).
Ultra Low-Power/Low-Energy CMOS Mixed-Signal Building Blocks.
NTNU.
ISBN 978-82-326-4322-6.
-
Choe, Ju Song; Gheorghe, Codin & Aunet, Snorre
(2018).
Test system design for a Photomultiplier Readout Board.
NTNU.
The S-DAM front-end board (S-DAM-FEB) is a photomultiplier readout module for charged-particle detection. The board has been designed by the company Integrated Detector Electronics AS (IDEAS) for the readout of sensors in radiation monitors. A large number of S-DAM-FEBs will be used in the neutron detector at the ESS (European Spallation Source) in Sweden for scientific research. Hundreds of these modules are therefore planned to be manufactured by a third-party EMS (Electronic Manufacturing Service) provider, and each board has to be validated after production. To validate the devices under test (DUTs) efficiently, it was decided to create a test system.
In this thesis, the main focus was on implementing the test system to validate the functionality of the S-DAM-FEB. The project work included specifying the test requirements, conceptualizing the test system, designing the schematics and PCB of the required hardware, and implementing firmware and software for the test system. Python scripts were created to run the test and log the results into a test report. The mechanics of the test system were modelled in a 3D model, and the mechanical components were chosen according to the drawings.
After all the modules for the test system had been implemented, they were assembled. The S-DAM-FEB was tested using the implemented test system. All functionality of the DUT was tested, including power consumption and temperature as well as gain, threshold and baseline for all channels. Minor faults in the DUT were found by the test, indicating that some failure occurred during the production process. All the test results were logged into a test report to track the modules for future use and to determine the condition of the DUTs after production.
The test system is in working condition. Performing the validation test is simple and easy. The runtime of the test is reduced, and most of the manual work is replaced by automated processes to obtain more reliable data and minimize possible human errors. The implementation of the test system was successful, and a large number of S-DAM-FEBs are ready to be validated.
-
Østerhus, Stian; Ytterdal, Trond & Aunet, Snorre
(2018).
Subthreshold CMOS Cell Library by 22 nm FDSOI Technology.
NTNU.
Two different CMOS transistors with a low threshold voltage, provided by a commercially available 22 nm FDSOI CMOS technology, were investigated and assembled into several libraries of logic gates. The logic gates provided in the cell library should be sufficient to create most digital logic circuits, and are in addition designed to work in the subthreshold region with a supply voltage of 350 mV. Physical layout designs were made for the different logic gates, and parasitic capacitances were then extracted to provide more realistic simulations and performance results. Compared to schematic simulation, layout design and parasitic capacitances proved to reduce speed by a factor of 5 to 10, and to increase the transistors' threshold voltage by 14.6% for the NMOS and 32.5% for the PMOS. The increased threshold voltage thus led to a reduced static power consumption and increased switching energy.
The transistor with the lowest threshold voltage showed especially good performance with respect to low power consumption while still meeting speed requirements. This transistor is referred to throughout the report as mosfet low. Two cell libraries were made for this transistor: one applies a forward body-bias of ±2 V, while the other has the bulk nodes connected to ground, which gives a 0 V body-bias. The libraries are supplied with schematics and layout designs, and are in addition characterized for performance data such as static power consumption, delay and switching energy consumption for every logic gate.
The aim of the project was a minimum speed of 40 MHz at the lowest possible power consumption for a 16by12-bit adder. Presented in this report is a 16by12-bit adder built from ripple-carry adders, which was simulated to reach a speed of 44.26 MHz at a supply voltage of VDD = 350 mV with 0 V body-bias. Static power and switching energy consumption were simulated to be 26.60 µW and 207.95 fJ, respectively.
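The static power and switching energy reported here can be combined into a rough total power figure. The following is a sketch only: it assumes that the 207.95 fJ switching energy is spent once per clock cycle at the 44.26 MHz operating frequency, which the summary does not state, and the function name is hypothetical.

```python
def dynamic_power_w(energy_per_cycle_j, clock_hz):
    """Dynamic power when a fixed switching energy is spent every clock cycle."""
    return energy_per_cycle_j * clock_hz


STATIC_POWER_W = 26.60e-6        # from the summary
SWITCHING_ENERGY_J = 207.95e-15  # from the summary
CLOCK_HZ = 44.26e6               # from the summary

dynamic_w = dynamic_power_w(SWITCHING_ENERGY_J, CLOCK_HZ)  # about 9.2 µW
total_w = STATIC_POWER_W + dynamic_w                       # about 35.8 µW
static_share = STATIC_POWER_W / total_w                    # about 0.74
```

Under that assumption, static power would make up roughly three quarters of the total, which is typical of why leakage gets so much attention in subthreshold design.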
-
Stubsjøen, Sivert; Moldsvor, Øystein & Aunet, Snorre
(2018).
Force measurement using a capacitive sensor and a compressible material.
NTNU.
Disruptive Technologies are developing sensor solutions for the Internet of Things. Their current sensors can measure touch, temperature, and proximity. To expand the range of applications these sensors cover, new sensor solutions are being examined. The one studied in this thesis is a capacitive sensor measuring force. The idea is to place a compressible material on the front of Disruptive Technologies' capacitive proximity sensor and use it to measure force. Compressing the material would increase the measured capacitance.
This thesis covers the work of finding suitable materials and the practical measurements done to characterize the capacitive sensor and the compressible material. Testing was done on two different materials that had properties useful for the intended application. These tests revealed that neither material was optimal for the solution described above. Across different series of measurements, the values measured by the sensor varied for the same applied load. This made it difficult to create a well-fitting data model. The proposed models could not predict with high probability the values measured by the sensor for the various applied loads.
This led to the conclusion that either the materials or the chosen sensor solution was not optimal for measuring force. As a result, two other force sensing methods using the same sensor are presented that can be investigated further in future work.
-
Paldas, Auritro; Barzic, Ronan & Aunet, Snorre
(2018).
Towards Predictable Placement of Standard Cells for Regularly Structured Designs.
NTNU.
Many components in modern digital designs have very regular structures. Some examples are Programmable Ring Oscillators, Time to Digital Converters and CPU register files. The proper functioning of these components depends heavily on how they are implemented in the design with respect to the placement of standard cells. This is because many of these components are delay sensitive, and the placement of cells in the layout affects the delay. Standard place and route tools, however, do not always ensure that the placement of standard cells is regularised, which can lead to sub-optimal results from these designs. The work in this thesis is aimed at ensuring a regular placement of standard cells for such components by developing a framework in a high-level language, from which the placement information needed by the place and route tools can be obtained. This information, when used by the tool, should result in a more predictable placement of standard cells and thus in better behaviour of such components.
-
Rørstad Helle, Even; Moldsvor, Øystein; Hernes, Bjørnar & Aunet, Snorre
(2018).
Humidity Sensor.
NTNU.
-
Lesund, Martin; Tjora, Sigve & Aunet, Snorre
(2017).
Ultra-low power serial communication for Internet of Things.
NTNU.
-
Lid, Gunnar; Hagen, Anders; Blekken, Brage; Ytterdal, Trond & Aunet, Snorre
(2017).
Ultra-low power Design of DSRC modulator/demodulator in 28nm FD_SOI.
NTNU.
-
L'Orange, Simon; Hagen, Anders; Blekken, Brage; Ytterdal, Trond & Aunet, Snorre
(2017).
4-7Ghz Tunable Programmable Pulse Generator in 65nm CMOS.
NTNU.
-
-
Liknes, Kai Robert; Hernes, Bjørnar & Aunet, Snorre
(2016).
Ultra Low Leakage Memory.
NTNU.
Three 64-byte memory systems were designed for a 0.18 µm standard CMOS technology: one 6T-SRAM system and two D-Flip-Flop systems. The leakage current, read energy and write energy of these systems were determined by simulation. A set of extrapolation formulas for area, leakage current, read energy and write energy was derived to determine the characteristics of the systems as the size of the memory increases. The simulations showed that the 64-byte 6T-SRAM system had a 39% lower area, an 83% lower leakage current, an 89% lower write energy and an 82% lower read energy than the reference D-Flip-Flop memory system. The extrapolation formulas predicted that as memory size increases, SRAM becomes more and more favorable in terms of area, leakage current, write energy and read energy.
-
Barua, Anomadarshi; Edwin, David & Aunet, Snorre
(2016).
Voice over mesh network.
NTNU.
-
Kvam Oma, Åsmund; Låte, Even; Vatanjou, Ali Asghar; Ytterdal, Trond & Aunet, Snorre
(2016).
Design of a near-threshold microcontroller.
NTNU.
There is a strong interest in ultra low voltage digital design as emerging applications like the Internet of Things, wearable biomedical sensors, radio frequency identification, sensor networks and more are gaining traction. This thesis describes the implementation, synthesis and testing of a microcontroller using a near-threshold library. The system has been described in VHDL and synthesized for near-threshold operation on 28 nm FDSOI production technology from STMicroelectronics. The microcontroller implements a 32-bit pipelined processor compatible with a subset of RISC-V and has SPI connectivity. Two single-port 2 kB SRAM modules are used as RAM. A power gating technique that reduces the static power in an ALU during runtime has been implemented and compared to a traditional ALU. Traditional coarse-grain power gating of the processor has also been implemented. Using a supply voltage of 350 mV and a clock speed of 1 MHz, the schematic SPICE simulation reported an average power consumption of 4.42 µW during program execution. In power gated mode the microcontroller consumed 2.98 µW. In a sensor logging program the average energy per executed instruction was 4.91 pJ. Runtime power gating reduced the average energy consumption of the ALU by 58% to 57%, with a propagation delay penalty of 346% to 143%, depending on the sizing of the power gating transistors.