Tag: Hardware
281 articles archived under #hardware
Dwarkesh Podcast news-outlet 1mo ago Reiner Pope – Chip design from the bottom up Working up from basic logic gates to why GPUs, TPUs, FPGAs, and the human brain each look the way they do. 22
arXiv — Machine Learning research 1mo ago TONIC: Token-Centric Semantic Communication for Task-Oriented Wireless Systems arXiv:2605.21553v1 Announce Type: new Abstract: Tokens are becoming the basic units through which foundation models represent and process information for understanding and inference. However, traditional wireless communication, centered on bit-level fidelity, faces a mismatch… 33
r/LocalLLaMA community 1mo ago When your LLM treats data center GPUs like an optional DLC submitted by /u/noprompt 10
Hugging Face Daily Papers research 1mo ago Capturing LLM Capabilities via Evidence-Calibrated Query Clustering Abstract Query clustering algorithm ECC improves LLM capability evaluation by aligning semantic embeddings with latent capability demands through posterior model comparisons and Bradley-Terry modeling. AI-generated summary Query clustering organizes queries into groups that… 13
Ars Technica — AI news-outlet 1mo ago As Grok flounders, SpaceX bets future on beating Big Tech at AI SpaceX IPO filing pitches orbital data centers as Grok lags rival AI services. 26
r/LocalLLaMA community 1mo ago Qwen3.6 35Ba3 has changed my workflows and even how I use my computer My workflow has changed basically to ask Codex to do certain tasks and then document how to do them (including errors it found on its way) into a skill.
I feed that skill to pi, and suddenly my qwen3.6 gets that hard stuff done: - devops on a VPS - using docling to create epubs… 33
Google DeepMind official-blog 1mo ago We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks The Asia-Pacific region is a global engine for economic growth, but it's also highly vulnerable to climate change. While green technologies are gaining momentum, a recent report shows they aren’t scaling fast enough to keep up with the region’s rising environmental risks. To… 22
r/MachineLearning community 1mo ago Can liveness detection models generalise to synthetic media generation techniques they were never trained on? [D] Most liveness detection systems in production today were built around a threat model where the attacker is submitting a static image or a basic replay video. The generation quality of current synthetic media is categorically different from what those training datasets captured.… 32
NVIDIA Developer Blog official-blog 1mo ago Get Real-Time Visibility into GPU Usage Across Kubernetes Clusters Maximizing the value of AI infrastructure demands deep visibility into GPU utilization. Yet many platform teams running AI workloads on Kubernetes operate with... 25
r/MachineLearning community 1mo ago I created an LLM post-training method called RPS. Preliminary results show that it improved Qwen3-8b's program synthesis reliability. [R] RPS is inspired by neuroscience. As humans, we learn basic skills as kids with high neuro-plasticity. We then learn advanced skills as teens and adults with low neuro-plasticity. RPS trains a model in 2 stages.
In stage 1, the model is trained on easy data with high learning… 26
Hugging Face Daily Papers research 1mo ago CutVerse: A Compositional GUI Agents Benchmark for Media Post-Production Editing Abstract Current GUI agents show limited effectiveness in professional media post-production tasks despite advances in spatial grounding and multimodal alignment. AI-generated summary While GUI agents have made significant progress in web navigation and basic operating system… 13
arXiv — Machine Learning research 1mo ago Unsupervised clustering and classification of upper limb EMG signals during functional movements: a data-driven arXiv:2605.20599v1 Announce Type: new Abstract: This study presents a comprehensive approach for the clustering and classification of upper-limb surface electromyography (sEMG) signals during functional reach and grasp movements. The methodology was applied to the NINAPRO DB4… 18
arXiv — NLP / Computation & Language research 1mo ago Post-Hoc Understanding of Metaphor Processing in Decoder-Only Language Models via Conditional Scale Entropy arXiv:2605.21391v1 Announce Type: new Abstract: Metaphor requires a language model to resolve a token whose contextual meaning diverges from its basic literal sense. Understanding how transformer models organize this reinterpretation across depth remains an open problem in… 19
The Information — AI news-outlet 1mo ago Anthropic and SpaceX Detail Compute Deal Worth Up to $40 Billion Anthropic could pay SpaceX up to $40 billion over the next several years to use compute from data centers, but either company has the power to call off the deal early, SpaceX revealed when it filed for an initial public offering on Wednesday.
SpaceX is receiving $1.25 billion… 16
Latent.Space news-outlet 1mo ago Railway: The Agent-Native Cloud — Jake Cooper 3M Users, 100K Signups/Week, Own-Metal Data Centers, $200K+ Coding Agent Spend, and the Death of PRs 21
TechCrunch — AI news-outlet 1mo ago Musk’s xAI is being sued over its data center generators. Now, it’s buying $2.8B more. Elon Musk's xAI said it will buy $2.8 billion worth of natural gas turbines over the next three years, according to SpaceX's IPO filing. 6
r/LocalLLaMA community 1mo ago 24GB M4 Mac - is Qwen 9B only option while system is running? I have mac at work that I want to use local model for prototyping and basic prompts that needs to stay on device. What sort of model I can run that I can fit at least 64k context ? Any setups share or guides welcome. I need to have firefox open with one tab at minimum. Problem I… 6
The Information — AI news-outlet 1mo ago Sam Altman Offers YC Founders $2 Million in OpenAI Tokens For Equity OpenAI cofounder and CEO Sam Altman late Tuesday offered to invest $2 million in every startup currently in the Y Combinator startup accelerator program—not in cash, but in OpenAI tokens. “I am excited to see what will happen with tokenmaxxing startups, both for how they work… 13
arXiv — Machine Learning research 1mo ago DynaTrain: Fast Online Parallelism Switching for Elastic LLM Training arXiv:2605.18815v1 Announce Type: new Abstract: Modern large language model (LLM) training is inherently dynamic: resource fluctuations, RLHF phase shifts, and cluster elasticity continually reshape the optimal parallelism layout, posing a significant challenge to existing… 22
arXiv — Machine Learning research 1mo ago A Multi-Dimensional Clustering Approach for Identifying Inborn Errors of Immunity arXiv:2605.18880v1 Announce Type: new Abstract: Rare diseases such as inborn errors of immunity (IEI) require early diagnosis to prevent end organ damage and improve quality of life.
Hurdles in accessing and curating large scale electronic health record (EHR) data limit routine… 10
arXiv — NLP / Computation & Language research 1mo ago Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering arXiv:2605.19220v1 Announce Type: new Abstract: Uncertainty Quantification (UQ) is widely regarded as the primary safeguard for deploying Large Language Models (LLMs) in high-stakes domains. However, we argue that the field suffers from a category error: mainstream UQ methods… 22
arXiv — NLP / Computation & Language research 1mo ago ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation arXiv:2605.18769v1 Announce Type: cross Abstract: Personalized Retrieval-Augmented Generation (RAG) relies on accurately selecting user-relevant documents. In practice, existing RAG approaches often suffer from high retrieval costs and overlook that collaborative signals from… 36
r/LocalLLaMA community 1mo ago Running DeepSeek-V4 locally with 4x legacy RTX 2080 Ti ($2k budget setup). Custom Turing kernels, W8A8 quantization, and 255 prefill tok/s! Hey r/DeepSeek , Who says we need an H100 cluster or the latest expensive GPUs to run frontier MoE models? I wanted to see how far we could push a single node of consumer legacy hardware, so we spent less than $2,500 total to build a budget machine that successfully runs… 29
r/LocalLLaMA community 1mo ago Intel's Crescent Island PCB Leaks, Showing a Massive Xe3P GPU, 16-Pin Connector, 160GB LPDDR5X as Intel Sidesteps the HBM Shortage Upcoming Intel Xe3P data center GPU with 20 8GB LPDDR5X modules for a total of 160GB, bypassing HBM shortages. Assuming a 32-bit interface, that's a 640-bit wide memory interface, or 10 channel memory interface if converted to the 64-bit wide desktop equivalent.
At 8800-9500MT,… 35
Ars Technica — AI news-outlet 1mo ago Electrical utility megamerger is all about the data centers NextEra’s blockbuster deal with Dominion likely means higher bills for consumers. 29
arXiv — Machine Learning research 1mo ago AdaGraph: A Graph-Native Clustering Algorithm That Overcomes the Curse of Dimensionality and Enables Scientific Discovery arXiv:2605.16320v1 Announce Type: new Abstract: We present AdaGraph, a graph-native clustering algorithm born from the Structure-Centric Machine Learning (SC-ML) paradigm -- a new field of unsupervised learning that replaces geometry-centric (distance-based) computation with… 24
arXiv — Machine Learning research 1mo ago HPC-LLM: Practical Domain Adaptation and Retrieval-Augmented Generation for HPC Support arXiv:2605.16347v1 Announce Type: new Abstract: Modern scientific research increasingly depends on High-Performance Computing (HPC) infrastructures, yet many researchers face significant operational barriers when interacting with cluster environments, job schedulers, GPU… 12
The Information — AI news-outlet 1mo ago The ‘Price is Right’ for GPUs: The Startup Turning Nvidia Chips Into ‘Boring’ Bankable Assets Good morning, Anissa here! The AI boom requires an enormous amount of new power plants, chips and data centers. But it also requires something that’s less visible: new financial plumbing that makes it possible to lend against this costly hardware. Miami fintech startup Barkr is… 38
The Information — AI news-outlet 1mo ago Edge Inference Chip Startup SiMa.ai Raising at $1.4 Billion Valuation Nvidia might be on a tear, but some investors are still convinced that there’s demand for another kind of specialized chips. And they’re putting their money where their mouth is.
For example: San Jose, Calif.-based SiMa.ai, which develops chips that work on devices such as… 14
Stratechery (Ben Thompson) community 1mo ago Data Center Discontent, Understanding the Opposition, Fixing the Problem There are understandable reasons for people to oppose data centers; the only solution that will work is simply paying them off. 4
arXiv — NLP / Computation & Language research 1mo ago Automatic Construction of a Legal Citation Graph from 100 Million Ukrainian Court Decisions: Large-Scale Extraction, Topological Analysis, and Ontology-Driven Clustering arXiv:2605.15362v1 Announce Type: new Abstract: Half a billion citation edges extracted from 100.7 million Ukrainian court decisions reveal that judicial citation structure encodes legal domain boundaries without supervision and predicts future legislative importance with… 19
r/LocalLLaMA community 1mo ago Qwen 3.6 27B Q8 on four Nvidia RTX A4000 (16GB each) with Llama.cpp and MTP enabled My setup is heterogenous, I originally acquired my server (Lenovo ThinkStation P3 Tower Gen 2) to run OpenShift/K8s clusters (because I work on that), and later on I started purchasing one by one… 14
r/LocalLLaMA community 1mo ago I trained TIME: short context-triggered thinking on Qwen model instead of overthinking Started this as a personal project for my Open-WebUI setup to use. Somehow it ended up as an ACL 2026 paper. Not some lab paper, it is personal solo independent paper that happened. TIME is basically my attempt to train Qwen3 models to think in short bursts wherever the response… 29
r/LocalLLaMA community 1mo ago Benchmarking vLLM vs SGLang vs llama.cpp on a mixed Blackwell/Ada cluster I have been running some benchmarks on a heterogeneous 7-GPU cluster to see how different inference engines handle long context prefill using pipeline parallelism.
My setup consists of a mix of Blackwell and Ada cards: one RTX PRO 6000 96GB, one PRO 5000 48GB, two 5090 32GB, and… 4
r/LocalLLaMA community 1mo ago The power of structured workflows and small local models A month ago, I experimented with a very basic home-rolled agent loop with a handful of tools and found it worked surprisingly well in spite of how crude it was: https://www.reddit.com/r/LocalLLaMA/comments/1sl7f8e/homerolled_loop_agent_is_surprisingly_effective/ Later, I wrote… 15
Dwarkesh Podcast news-outlet 1mo ago Notes on pretraining parallelisms and failed training runs. Deeply researched interviews 37
Hacker News — AI on Front Page community 1mo ago Bun Rust rewrite: "codebase fails basic miri checks, allows for UB in safe rust" Article URL: https://github.com/oven-sh/bun/issues/30719 Comments URL: https://news.ycombinator.com/item?id=48150900 Points: 246 # Comments: 154 22
The Algorithmic Bridge news-outlet 1mo ago Weekly Top Picks #121 Hottest AI job: FDEs / Trump in China / No jobpocalypse / AI models fixing benchmarks / Americans agree: "no datacenters here" / Claude 3 vs Claude 4 14
Ars Technica — AI news-outlet 1mo ago Pennsylvanians use town hall meeting to rail against data center boom “This is a public trust and transparency issue.” 22
Hacker News — AI on Front Page community 1mo ago Prolog Basics Explained with Pokémon Article URL: https://unplannedobsolescence.com/blog/prolog-basics-pokemon/ Comments URL: https://news.ycombinator.com/item?id=48147091 Points: 215 # Comments: 34 34
r/LocalLLaMA community 1mo ago Important (vision) Qwen3.5 template fix dropped in vllm Sharing this because I personally had some annoying issues and I can confirm this un-fucked them. Basically once you posted an image in the conversation the model went haywire.
Not too badly but annoying submitted by /u/Dany0 14
arXiv — Machine Learning research 1mo ago Towards Resource-Efficient LLMs: End-to-End Energy Accounting of Distillation Pipelines arXiv:2605.13981v1 Announce Type: new Abstract: The rise in deployment of large language models has driven a surge in GPU demand and datacenter scaling, raising concerns about electricity use, grid stress, and the impacts of modern AI workloads. Distillation is often promoted as… 19
arXiv — Machine Learning research 1mo ago EnergyLens: Predictive Energy-Aware Exploration for Multi-GPU LLM Inference Optimization arXiv:2605.14249v1 Announce Type: new Abstract: We present EnergyLens, an end-to-end framework for energy-aware large language model (LLM) inference optimization. As LLMs scale, predicting and reducing their energy footprint has become critical for sustainability and datacenter… 12
arXiv — Machine Learning research 1mo ago LoMETab: Beyond Rank-1 Ensembles for Tabular Deep Learning arXiv:2605.14365v1 Announce Type: new Abstract: Recent tabular learning benchmarks increasingly show a tight performance cluster rather than a clear hierarchy among leading methods, spanning gradient boosted decision trees, attention-based architectures, and implicit ensembles… 4
Hugging Face Daily Papers research 1mo ago LLM-based Detection of Manipulative Political Narratives Abstract A computational framework combining prompt-based filtering and unsupervised clustering identifies manipulative political narrative clusters from social media posts without requiring predefined categories. AI-generated summary We present a new computational framework for… 9
The Information — AI news-outlet 1mo ago Newmark Data Center Advisor Brent Mayo Departs for DigitalBridge Brent Mayo, head of data center capital markets at advisory firm Newmark, has left the firm and told people he is joining investment firm DigitalBridge, according to two people with knowledge of the move.
At Newmark, which specialized in commercial real estate, Mayo was a part… 26
Hugging Face Daily Papers research 1mo ago Federation of Experts: Communication Efficient Distributed Inference for Large Language Models Abstract Federation of Experts restructures mixture of experts blocks into clusters that process KV heads independently, eliminating inter-node communication bottlenecks while maintaining generation quality. AI-generated summary Mixture of experts has emerged as the primary… 23
Ars Technica — AI news-outlet 1mo ago Energy supplier abandons Lake Tahoe residents to serve data centers Town’s 49,000 California residents compete with Nevada data centers for energy. 19
r/MachineLearning community 1mo ago OpenAI's deployment company move says more about the AI gap than any benchmark [D] OpenAI launched a deployment company with $4B initial investment, 19 partner organizations, and acquired Tomoro (UK-based AI consultancy, ~150 engineers). The pitch: embed "Forward Deployed Engineers" into enterprises to help them actually use AI. This is basically the Palantir… 35
arXiv — Machine Learning research 1mo ago scShapeBench: Discovering geometry from high dimensional scRNAseq data arXiv:2605.12662v1 Announce Type: new Abstract: High-dimensional point cloud data arise across many scientific domains, especially single-cell biology. The shapes or topologies of these datasets determine the types of information that can be extracted. For example, clustered… 32