![hierarchical FPGA utilization map for the BIO implementation on a Digilent Arty (XC7A100T), total design: 41,757 cells. the BIO block (bio_apb, highlighted in magenta outline) consumes 14,597 cells... roughly 35% of the design. inside, four PicoRV32 cores (mach[0] through mach[3]) range from 1,698 to 1,937 leaf cells each... compact enough that all four together barely exceed a single PIO state machine's ~5,000 cell footprint. the host CPU (VexRiscvAxi4) sits at 8,017 cells including caches, with the remaining area eaten by AXI crossbars, bus adapters, and bridge logic. compare this to the PIO utilization map shown earlier in the post, where the PIO alone consumed 39,087 cells and dominated the floorplan... the BIO achieves a richer RV32E instruction set in less than 40% of the PIO's area. the visual tells the RISC-vs-CISC story immediately: four full CPU cores plus bus infrastructure, and you're still smaller than nine custom instructions with barrel shifters.](https://cdn-blog.adafruit.com/uploads/2026/03/bio-only-utilization.jpeg)
bunnie huang just dropped a deep-dive on the BIO, the I/O coprocessor he designed for the Baochip-1x, a mostly open source 22nm SoC, and it’s a banger!
tl;dr: bunnie wanted something like the Raspberry Pi’s PIO but ran into problems. He built a full PIO clone for an FPGA and discovered it ate more silicon than the RISC-V CPU itself. Its critical path was twice as slow, too. Each PIO instruction tries to do everything at once, which means barrel shifters everywhere and combinational paths from hell.
So he went the other direction. The BIO uses four tiny RISC-V cores (PicoRV32, RV32E) with a trick from his PhD work… some registers in the register file are actually queues with blocking semantics. When you try to read from an empty FIFO register, the CPU halts until data shows up. Ditto when you try to write to a full one. There’s also a “snap to quantum” register that pauses execution until a clock tick, so you get deterministic timing without cycle-counting.
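To make the blocking-register idea concrete, here's a minimal software sketch in C. It is not bunnie's RTL or the BIO's real register map: `fifo_t`, `core_t`, `FIFO_DEPTH`, and the step functions are all hypothetical names invented for illustration, and the depth of 4 is an arbitrary assumption. It models a toy core that simply does not advance when it touches an empty (or full) queue register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_DEPTH 4u /* assumed depth, chosen only for this sketch */

/* Hypothetical model of one queue-backed register. */
typedef struct {
    uint32_t data[FIFO_DEPTH];
    unsigned head;  /* index of the oldest entry */
    unsigned count; /* entries currently queued */
} fifo_t;

/* Pop the oldest entry; returns false when the queue is empty. */
bool fifo_try_pop(fifo_t *f, uint32_t *out) {
    assert(f->count <= FIFO_DEPTH);
    if (f->count == 0) return false;
    *out = f->data[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return true;
}

/* Append an entry; returns false when the queue is full. */
bool fifo_try_push(fifo_t *f, uint32_t v) {
    assert(f->count <= FIFO_DEPTH);
    if (f->count == FIFO_DEPTH) return false;
    f->data[(f->head + f->count) % FIFO_DEPTH] = v;
    f->count++;
    return true;
}

/* Hypothetical toy core: counts retired instructions and stalled cycles. */
typedef struct {
    unsigned pc;     /* instructions completed */
    unsigned stalls; /* cycles spent halted on a queue register */
    uint32_t last;   /* most recent value read from the queue */
} core_t;

/* One cycle of "read the FIFO register": stall if it is empty. */
void core_step_read(core_t *c, fifo_t *f) {
    uint32_t v;
    if (fifo_try_pop(f, &v)) {
        c->last = v;
        c->pc++;
    } else {
        c->stalls++; /* pc does not move: the instruction simply waits */
    }
}

/* One cycle of "write the FIFO register": stall if it is full. */
void core_step_write(core_t *c, fifo_t *f, uint32_t v) {
    if (fifo_try_push(f, v)) {
        c->pc++;
    } else {
        c->stalls++;
    }
}
```

Call `core_step_read` on an empty `fifo_t` a few times and `stalls` climbs while `pc` stays put. Push a value from the "other side" and the very next step completes. That's the whole trick: synchronization falls out of an ordinary register access, with no polling loop in the program.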
The result is half the area of a PIO, 4x the clock rate in ASIC, and you can write code in C (via Zig’s clang). bunnie’s post walks through DMA, SPI bitbang, and WS2812/NeoPixel LED examples. The whole LED demo with fixed-point math fits in 25% of one core’s 4 KiB memory, or about 1 KiB.
“The PIO, while kind of neat as an abstract mental concept, really bugged me as an implementer. Barrel shifters are expensive in hardware.”
In classic bunnie fashion, the final implementation is open source & patent-free. If you want to play with it, the low-cost Dabao dev board is on Crowd Supply.
OK so what does this mean for normal humans?
If you’ve ever used a Raspberry Pi Pico, you might’ve heard of PIO. It’s the part that handles tricky timing stuff like talking to LEDs or SPI devices. bunnie built an inexpensive, miniature version of that idea using standard RISC-V cores instead of a custom instruction set. It’s smaller, faster on silicon, you can program it in C instead of a specialized assembly language, and it’s completely open source. If you’re into hardware design or just want to see what’s possible when someone rethinks a problem from scratch, go read the full post @ bunnie’s blog – bio-the-bao-i-o-coprocessor/.
from Adafruit Industries – Makers, hackers, artists, designers and engineers! https://ift.tt/ZU3lSrM