Tag: Hardware
281 articles archived under #hardware

arXiv — Machine Learning · research · 22d ago
Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters arXiv:2606.09924v1 Announce Type: new Abstract: Deploying deep neural networks on memory-constrained edge accelerators is bottlenecked by per-inference off-chip weight transfer rather than computation: the dense network cannot be retained on-chip, and every parameter must be… 29

arXiv — NLP / Computation & Language · research · 22d ago
Which LoRA? An Empirical Study on the Effectiveness of LoRA Techniques During Multilingual Instruction Tuning arXiv:2606.10428v1 Announce Type: new Abstract: We investigate whether commonly available LoRA variants have an advantage over basic LoRA in multilingual instruction tuning. Experiments involving LoRA and four other variants on two datasets across diverse target languages show… 9

arXiv — NLP / Computation & Language · research · 22d ago
Agentic Hybrid RAG for Evidence-Grounded Muon Collider Analysis arXiv:2606.10381v1 Announce Type: cross Abstract: Muon collider research spans accelerator physics, detector instrumentation, and high-energy phenomenology, with relevant evidence scattered across a rapidly expanding and heterogeneous body of scientific literature. As… 37

arXiv — NLP / Computation & Language · research · 22d ago
SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference arXiv:2606.10445v1 Announce Type: cross Abstract: Semi-structured 2:4 sparsity is widely supported by modern accelerators, providing up to a 2x theoretical speedup.
However, its strict 50% sparsity constraint often causes non-negligible accuracy degradation under post-training… 21

MIT News — AI · research · 22d ago
Startup’s nuclear-inspired cooling system could make data centers more sustainable Founded by two researchers from MIT, Ferveret reduces the amount of energy and water required to cool the chips that power AI. 17

Hugging Face Daily Papers · research · 22d ago
EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents Abstract EEVEE is a novel test-time prompt learning framework for LLM agents that handles heterogeneous data streams through task clustering and co-evolving router-prompt optimization. Generated by Qwen/Qwen2.5-Coder-32B-Instruct In this paper, we propose EEVEE, the first… 6

The Information — AI · news-outlet · 22d ago
OpenAI in Talks to Lease 10 Gigawatt Ohio Data Center with Backing From Nvidia OpenAI is in advanced negotiations to lease a proposed 10 gigawatt data center campus on federal land in Ohio as part of a deal that could include financial backing from Nvidia, according to two people with direct knowledge of the discussions. The campus under discussion would… 12

r/LocalLLaMA · community · 22d ago
Furiosa AI selling inference chip to consumer market will be a game changer to local llm This is South Korean start up all-in on inference chip: https://furiosa.ai/renegade-spec TSMC 5nm node Hynix HBM3 1.5TB/s 48GB VRAM TDP 180W Already tested on LG LLM. If they opened their programming interface the way NVIDIA opens PTX and Intel opens SPIR-V, and team up… 12

Hugging Face Daily Papers · research · 22d ago
Agents' Last Exam Abstract Agents' Last Exam (ALE) is a benchmark for evaluating AI agents on long-term, economically valuable real-world tasks across 13 industry clusters with 1K+ tasks, revealing significant gaps between benchmark performance and practical deployment.
Generated by… 6

The Information — AI · news-outlet · 23d ago
Broadcom to Help Finance Anthropic, OpenAI Chip Deals With Apollo, Blackstone Broadcom said Tuesday that it is launching a new fund—backed by Apollo and Blackstone—to help finance more than 20 gigawatts of AI data centers through 2028 using chips designed by Broadcom, including projects tied to Anthropic and OpenAI. Apollo will lead an initial $35 billion… 19

Google DeepMind · official-blog · 23d ago
Powering the future of robotics in Europe Jun 09, 2026 · Google DeepMind Accelerator selects 15 robotics companies from across Europe to join the program. Providing 3 months of intensive mentorship and technical support, enabling the… 22

r/LocalLLaMA · community · 23d ago
Apple announced new on device inference engine for Apple Silicon This news seems to have flown under the radar. Apple announced CoreAI at WWDC which is basically a future replacement for CoreML and an alternative to MLX/llama.cpp/torch for on-device optimized inference, especially on phones and tablets. The model weights need to be converted… 25

TechCrunch — AI · news-outlet · 23d ago
How an e-scooter founder raised $5 million to build space data centers Orbital founder Euwyn Poon built 250,000 scooters at Spin. Now he wants to launch 10,000 space data centers. 27

Hugging Face Daily Papers · research · 23d ago
Text-to-Image Models Need Less from Text Encoders Than You Think Abstract Text-to-image models primarily utilize basic text representation aspects like word merging and order rather than complex contextual information encoded in full text embeddings.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct Text-to-image models rely on text prompts as… 36

arXiv — Machine Learning · research · 23d ago
The Routing Plateau: Understanding and Breaking the Accuracy Limits of LLM Routers arXiv:2606.07587v1 Announce Type: new Abstract: LLM routing has become a popular approach to improve the cost-quality trade-off of LLM services by dynamically selecting a model for each query. Recent work has explored a broad range of routing methods, including clustering-based… 29

arXiv — Machine Learning · research · 23d ago
EssentialGIN: a new approach for gene essentiality prediction based on graph isomorphism neural networks arXiv:2606.07700v1 Announce Type: new Abstract: Background: Prediction of essential genes (proteins), is a basic and challenging problem but at the same time very costly and time-consuming in wet-lab experiments. Predicting essential genes, only based on computational methods… 5

r/LocalLLaMA · community · 23d ago
New MLX LM Server From Apple Key Technical Advantages: Performance: The M5 chip's neural accelerators significantly boost prompt processing Concurrency: MLX LM Server utilizes continuous batching to handle multiple sub-agent requests simultaneously without stalling Scaling: For massive models that exceed… 15

NVIDIA Developer Blog · official-blog · 23d ago
Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell Pre-training frontier LLMs comes down to throughput. When training spans trillions of tokens across thousands of accelerators, every percentage point of step... 34

Hacker News — AI on Front Page · community · 23d ago
Show HN: Gitdot – A better GitHub. Open-source, written in Rust What works now: user signups, org creations, private/public repos, and importing GitHub repositories (both as read-only mirrors and full migrations). So basically, you can create, push and pull to a repo, but we don't have many features quite yet (issues, PRs, CI).
What is a bit… 34

Hacker News — AI on Front Page · community · 24d ago
A Farmer Donated Land to Turn into a Park. The City Is Building a Data Center Article URL: https://www.404media.co/a-farmer-donated-land-to-turn-into-a-park-the-city-is-building-a-massive-data-center-instead/ Comments URL: https://news.ycombinator.com/item?id=48446439 Points: 252 # Comments: 128 30

The Information — AI · news-outlet · 24d ago
Developers of OpenAI’s Stargate Data Center Face Higher Costs On a dusty stretch of prairie in Abilene, Texas, anxious hardware engineers from Crusoe, a data center developer for OpenAI and Oracle, have been working overtime to get natural gas turbines to work harmoniously with one of the most expensive AI supercomputers in history. It has… 37

arXiv — Machine Learning · research · 24d ago
Towards Serverless Semi-Decentralized Federated Learning with Heterogeneous Optimizers arXiv:2606.06687v1 Announce Type: new Abstract: We investigate cluster formation, involving the number and composition of clusters, in decentralized federated learning (FL) with heterogeneous machine learning (ML) optimizers. While clustering in centralized FL has enabled… 21

arXiv — Machine Learning · research · 24d ago
SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling arXiv:2606.06820v1 Announce Type: new Abstract: Agentic Large Language Model (LLM) systems decompose complex tasks into workflow Directed Acyclic Graphs (DAGs) whose primitives must be scheduled on heterogeneous clusters.
Existing deep reinforcement learning (DRL) schedulers are… 26

arXiv — Machine Learning · research · 24d ago
Explaining Unsupervised Disease Staging in Huntington's Disease: Insights into Model Representations and Clusters arXiv:2606.07135v1 Announce Type: new Abstract: Huntington's disease (HD) is a progressive neurodegenerative disorder that affects motor, cognitive, and behavioral functions, where accurate characterization of disease progression remains essential to improve patient outcome and… 25

arXiv — Machine Learning · research · 24d ago
Unsupervised Continual Clustering via Forward-Backward Knowledge Distillation arXiv:2606.07474v1 Announce Type: new Abstract: Unsupervised Continual Learning (UCL) aims to enable neural networks to learn sequential tasks without labels or access to past data. A major challenge in this setting is Catastrophic Forgetting, where models forget previously… 23

arXiv — NLP / Computation & Language · research · 24d ago
Multilingual Multi-Speaker Unit Vocoders: A Systematic Analysis of Discrete Speech Representations arXiv:2606.06740v1 Announce Type: cross Abstract: Discrete speech units obtained via k-means clustering of self supervised embeddings entangle phonetic, speaker, and language information, causing speaker mixing and cross-lingual interference in multilingual multi-speaker speech… 22

r/LocalLLaMA · community · 25d ago
Clustering 3x Jetson Nano Orin Supers Hey everyone! Recently, I released a blog on how to set up a cluster out of your Raspberry Pi 4bs and Mac minis for distributed training and inference. Now it's time to do the same with Jetson Nano Orin Super! Why?
- 1024 CUDA Cores (Ampere)
- 8GB unified memory LPDDR5
- 6x ARM… 26

Hacker News — AI on Front Page · community · 26d ago
Google to pay SpaceX $920M a month for compute capacity at xAI data centers Article URL: https://www.cnbc.com/2026/06/05/google-to-pay-spacex-920-million-a-month-for-xai-compute-capacity.html Comments URL: https://news.ycombinator.com/item?id=48417490 Points: 231 # Comments: 804 38

Ars Technica — AI · news-outlet · 26d ago
"We pissed off a lot of people": Giant data center plan cut 50% amid protests Developer felt "beaten up," with "no choice" but to shrink data center. 35

r/MachineLearning · community · 27d ago
How do you identify researchers who are good? [D] About 10 years ago, I got into the basics of ML (like regression, KNN's, LVQ's) and read a few papers before taking a break a few years back. It feels like now, there's a lot of researchers in AI. How do you identify the ones who are actually solid vs those who (forgive my… 19

TechCrunch — AI · news-outlet · 27d ago
AirTrunk commits $30B to build 5GW of AI data centers in India The Australian data center operator plans to set up 5GW of capacity in India. 14

arXiv — Machine Learning · research · 27d ago
Staged Factorial Screening for Budget-Constrained Micro-Pretraining arXiv:2606.05186v1 Announce Type: new Abstract: Budget-constrained micro-pretraining often requires triaging many candidate recipes on a shared accelerator before larger search budgets are spent. We study whether a staged fractional-factorial workflow can recover stable early… 14

arXiv — Machine Learning · research · 27d ago
A Machine Learning-Based Framework for Discovering Huntington's Disease Stages: Integrating Graph Representation Learning and clustering to Uncover Progression Dynamics in Longitudinal Enroll-HD Dataset arXiv:2606.06196v1 Announce Type: new Abstract: Huntington's disease (HD) is a progressive brain disorder that gradually affects movement, cognitive function, and behavior.
Identifying the stage of the disease accurately and consistently is important for understanding its… 31

The Information — AI · news-outlet · 27d ago
Data Center Developer Switch in Talks to Raise Billions at $50 Billion-Plus Valuation Data center developer Switch is in talks to raise billions of dollars at a valuation of at least $50 billion, a level that would make it one of the most valuable privately held data center operators, The Information reported late Thursday. Brookfield Asset Management, KKR and… 28

The Information — AI · news-outlet · 27d ago
Data Center Developer Switch in Talks to Raise Billions at $50 Billion-Plus Valuation Data center developer Switch is in talks to raise billions of dollars at a valuation of at least $50 billion, as it seeks to capitalize on soaring demand for the infrastructure needed to support artificial intelligence, according to people with knowledge of the deal. Brookfield… 34

r/MachineLearning · community · 27d ago
Scrap the LLMs. Scoring 4.76% on the brand new ARC-3 using pure code, a 2012 AMD CPU, and zero AI tokens.[P] Hey everyone, The ARC Prize 2026 just launched the interactive ARC-AGI-3 track, and the collective AI world is panic-renting massive H100 clusters trying to get multi-billion parameter LLMs to navigate these dynamic environments. Predictably, out-of-the-box LLMs are faceplanting… 31

TechCrunch — AI · news-outlet · 27d ago
Meta steals a tactic from Tesla and builds data centers in tents Meta may have found one way to slash its massive data center bill: tents. 9

r/LocalLLaMA · community · 27d ago
Qwen 3.6 27B 30GB Same top p: 98.358 ± 0.033 % vs UD Q8 K XL 33GB Same top p: 97.426 ± 0.041 % This is not a diss to Unsloth, they make great quants and really move this community forward. I've been experimenting with quanting specific sublayers based on which ones have the most outliers post Q8 quant.
I basically did a BF16 to Q8_0 conversion and looked at the post quant… 8

Ars Technica — AI · news-outlet · 28d ago
How some data center operators are tackling their water use problems Hyperscalers have come under scrutiny for their impact on water quality and availability. 7

The Information — AI · news-outlet · 28d ago
Fusion Startup Helion Nearly Triples Valuation to $15.5 Billion in Thrive-led Round Helion Energy, a nuclear fusion startup backed by OpenAI’s Sam Altman, still has to prove it can produce electricity to serve data centers and other customers. But investors seem confident it can deliver. The Everett, Wash.–based company said it has raised $465 million in… 33

arXiv — Machine Learning · research · 28d ago
Contrastive Learning and Correlation Clustering for Sequences of Network Telescope Data arXiv:2606.04733v1 Announce Type: new Abstract: Understanding activities of Internet scanners is challenging; it often requires identifying relationships between sources, a task for which semantic annotations are scarce. This work investigates whether semantically meaningful… 36

arXiv — NLP / Computation & Language · research · 28d ago
Arithmetic Pedagogy for Language Models arXiv:2606.05106v1 Announce Type: new Abstract: We investigate whether methods of human mathematics pedagogy can guide the training of language models toward arithmetic reasoning. Building on the GASING method -- an Indonesian pedagogy that solves basic arithmetic through a… 32

arXiv — Machine Learning · research · 29d ago
A Nonmonotone Gradient-Based Algorithm for Symmetric Nonnegative Matrix Factorization and Graph Clustering arXiv:2606.02887v1 Announce Type: new Abstract: Symmetric nonnegative matrix factorization (Symmetric NMF) approximates a matrix as $WW^T$ with nonnegative rectangular factor $W$. It has broad applications in graph clustering and machine learning.
In contrast to the NMF,… 9

arXiv — Machine Learning · research · 29d ago
KForge: LLM-Driven Cross-Platform Kernel Generation for AI Accelerators arXiv:2606.02963v1 Announce Type: new Abstract: Production inference increasingly targets a heterogeneous mix of accelerators. Agentic pipelines interleave reasoning, tool calls, and multi-agent coordination, each with distinct compute and memory profiles. For optimal… 19

arXiv — NLP / Computation & Language · research · 29d ago
Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models arXiv:2606.03846v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate remarkable performance across diverse tasks, but they often generate responses that appear plausible while being factually incorrect. This problem is compounded by the lack of explicit… 13

r/LocalLLaMA · community · 29d ago
I Put a Datacenter GPU in My Gaming PC for £200 Hey there! I wrote a blogpost about my experience running local models on a V100 from a newbie perspective and got loads of views outside of reddit, so I thought I'd share it here too! submitted by /u/tymscar 33

The Information — AI · news-outlet · 1mo ago
SK Hynix to Double Capacity as AI Strains Memory Supply SK Hynix plans to double its memory chip capacity within five years as AI demand keeps straining global supply, Bloomberg reported. The expansion can ease one of the biggest hardware constraints facing AI data centers. Chairman Chey Tae-won said in Taipei that the memory crunch… 20

arXiv — Machine Learning · research · 1mo ago
PE-means: Improved Differentially Private $k$-means Clustering through Private Evolution arXiv:2606.00342v1 Announce Type: new Abstract: We study the problem of differentially private (DP) $k$-means clustering in Euclidean space. Previous solutions rely on summing the private data directly, which induces a sensitivity proportional to the domain.
We introduce… 17

arXiv — NLP / Computation & Language · research · 1mo ago
French parsing enhanced with a word clustering method based on a syntactic lexicon arXiv:2606.00634v1 Announce Type: new Abstract: This article evaluates the integration of data extracted from a French syntactic lexicon, the Lexicon-Grammar (Gross, 1994), into a probabilistic parser. We show that by applying clustering methods on verbs of the French Treebank… 16

arXiv — NLP / Computation & Language · research · 1mo ago
Agentic Clustering: Controllable Text Taxonomies via Multi-Agent Refinement arXiv:2606.01255v1 Announce Type: new Abstract: Recent text-clustering methods use large language models to propose a cluster taxonomy from a corpus and then assign each text to it. These pipelines are fundamentally programmatic: the sequence of LLM calls and the rules for… 37