
Insights from IEDM 2025
08.1.2026 | 42 min
Austin and Vik discuss key insights from the IEDM conference. They explore the significance of IEDM for engineers and investors, the networking opportunities it offers, and the latest innovations in silicon photonics, complementary FETs, NAND flash memory, and GaN-on-silicon chiplets.

Takeaways:
- Penta-level NAND flash memory could disrupt the SSD market
- GaN-on-silicon chiplets enhance power efficiency
- Complementary FETs
- Optical scale-up has a power problem
- The future of transistors is still bright

Nvidia "Acquires" Groq
05.1.2026 | 40 min
Key Topics:
- What Nvidia actually bought from Groq, and why it is not a traditional acquisition
- Why the deal triggered claims that GPUs and HBM are obsolete
- Architectural trade-offs between GPUs, TPUs, XPUs, and LPUs
- SRAM vs. HBM: speed, capacity, cost, and supply chain realities
- Groq LPU fundamentals: VLIW, compiler-scheduled execution, determinism, ultra-low latency
- Why LPUs struggle with large models and where they excel instead (a rough capacity sketch follows below)
- Practical use cases for hyper-low-latency inference:
  - Ad copy personalization at search latency budgets
  - Model routing and agent orchestration
  - Conversational interfaces and real-time translation
  - Robotics and physical AI at the edge
- Potential applications in AI-RAN and telecom infrastructure
- Memory as a design spectrum: SRAM-only, SRAM plus DDR, SRAM plus HBM
- Nvidia’s growing portfolio approach to inference hardware, rather than one-size-fits-all

Core Takeaways:
- GPUs are not dead. HBM is not dead.
- LPUs solve a different problem: deterministic, ultra-low-latency inference for small models.
- Large frontier models still require HBM-based systems.
- Nvidia’s move expands its inference portfolio surface area rather than replacing GPUs.
- The future of AI infrastructure is workload-specific optimization and TCO-driven deployment.
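The large-model point is easiest to see with a back-of-the-envelope calculation. A minimal sketch, assuming weights must live entirely in on-chip SRAM, roughly 230 MB of SRAM per first-generation LPU (the commonly cited figure), and 1-byte FP8/INT8 weights; the numbers are illustrative, not from the episode.

```python
import math

# Back-of-the-envelope: why SRAM-only accelerators struggle with large models.
# Assumptions (illustrative, not from the episode): weights resident entirely
# in on-chip SRAM, ~230 MB of SRAM per chip, 1 byte per parameter (FP8/INT8).
SRAM_PER_CHIP_GB = 0.23
BYTES_PER_PARAM = 1

def min_chips(params_billions: float) -> int:
    """Fewest chips whose combined SRAM can hold the weights alone
    (ignores KV cache, activations, and replication for bandwidth)."""
    weight_gb = params_billions * BYTES_PER_PARAM  # 1e9 params * 1 B = 1 GB
    return math.ceil(weight_gb / SRAM_PER_CHIP_GB)

for b in (8, 70, 400):
    print(f"{b:>4}B params -> at least {min_chips(b):>4} chips")
# ->    8B params -> at least   35 chips
# ->   70B params -> at least  305 chips
# ->  400B params -> at least 1740 chips
```

Run in reverse, the same arithmetic shows where LPUs excel: a small model sharded across a handful of chips serves every weight at SRAM latency and bandwidth, which is what makes the deterministic, ultra-low-latency use cases above plausible.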

Nvidia CES 2026
05.1.2026 | 47 min
Episode Summary:
Austin and Vik break down NVIDIA’s CES 2026 keynote, focusing on Vera Rubin, DGX Spark and DGX Station, uneducated investor panic, and physical AI.

Key Takeaways:
- DGX Spark brings server-class NVIDIA architecture to the desktop at low power, aimed at developers, enthusiasts, and enterprises experimenting locally.
- DGX Station functions more like a mini AI rack on-prem: Grace Blackwell for inference and development without full racks.
- The historical parallel is mainframes to minicomputers: expanding the compute TAM rather than displacing cloud usage.
- On-prem AI converts some GPU rental OpEx into CapEx, which appeals to CFOs.
- NVIDIA positioned autonomy as physical AI, with vision-language-action models and early Mercedes-Benz deployments in 2026.
- Vera Rubin integrates CPU, GPU, DPU, networking, and photonics into a single platform, emphasizing Ethernet for scale-out. (Where was the InfiniBand switch?)
- The new Vera CPU underscores the rising importance of CPUs for agentic workloads, with higher core counts, SMT, and large LPDDR capacity.
- The Rubin GPU’s move to HBM4 and adaptive precision targets inference efficiency gains and lower cost per token.
- Context memory storage elevates SSDs and DPUs, enabling massive KV cache offload beyond HBM and DRAM (a rough sizing sketch follows below).
- Cable-less rack design and warm-water cooling show NVIDIA’s shift from raw performance toward manufacturability and enterprise polish.
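The KV cache offload point is easiest to motivate with numbers. A minimal sizing sketch, assuming a generic 80-layer decoder with grouped-query attention and FP16 cache entries; every parameter here is illustrative, not taken from the keynote or any shipping NVIDIA model.

```python
# Rough KV cache sizing: why long contexts spill past HBM into DRAM and SSD.
# All model parameters are illustrative (a generic 70B-class decoder with
# grouped-query attention), not taken from the keynote.
N_LAYERS = 80     # decoder layers (assumed)
N_KV_HEADS = 8    # grouped-query attention KV heads (assumed)
HEAD_DIM = 128    # per-head dimension (assumed)
BYTES = 2         # FP16/BF16 cache entries (assumed)

def kv_cache_gb(context_tokens: int, concurrent_requests: int = 1) -> float:
    """Total KV cache in GB: two tensors (K and V) per layer per token."""
    bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES
    return context_tokens * concurrent_requests * bytes_per_token / 1e9

print(f"{kv_cache_gb(128_000):.1f} GB for one 128k-token context")  # ~41.9 GB
print(f"{kv_cache_gb(128_000, 64):.0f} GB for a batch of 64")       # ~2684 GB
```

A modest batch of long contexts already dwarfs a single GPU’s HBM, which is presumably why the keynote treats SSDs and DPUs as a first-class context memory tier.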


