Tag

Hardware

281 articles archived under #hardware

arXiv — Machine Learning research 22d ago

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

arXiv:2606.09924v1 Announce Type: new Abstract: Deploying deep neural networks on memory-constrained edge accelerators is bottlenecked by per-inference off-chip weight transfer rather than computation: the dense network cannot be retained on-chip, and every parameter must be…

29
arXiv — NLP / Computation & Language research 22d ago

Which LoRA? An Empirical Study on the Effectiveness of LoRA Techniques During Multilingual Instruction Tuning

arXiv:2606.10428v1 Announce Type: new Abstract: We investigate whether commonly available LoRA variants have an advantage over basic LoRA in multilingual instruction tuning. Experiments involving LoRA and four other variants on two datasets across diverse target languages show…

9
arXiv — NLP / Computation & Language research 22d ago

Agentic Hybrid RAG for Evidence-Grounded Muon Collider Analysis

arXiv:2606.10381v1 Announce Type: cross Abstract: Muon collider research spans accelerator physics, detector instrumentation, and high-energy phenomenology, with relevant evidence scattered across a rapidly expanding and heterogeneous body of scientific literature. As…

37
arXiv — NLP / Computation & Language research 22d ago

SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference

arXiv:2606.10445v1 Announce Type: cross Abstract: Semi-structured 2:4 sparsity is widely supported by modern accelerators, providing up to a 2x theoretical speedup. However, its strict 50% sparsity constraint often causes non-negligible accuracy degradation under post-training…

21
MIT News — AI research 22d ago

Startup’s nuclear-inspired cooling system could make data centers more sustainable

Founded by two researchers from MIT, Ferveret reduces the amount of energy and water required to cool the chips that power AI.

17
Hugging Face Daily Papers research 22d ago

EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents

Abstract EEVEE is a novel test-time prompt learning framework for LLM agents that handles heterogeneous data streams through task clustering and co-evolving router-prompt optimization. Generated by Qwen/Qwen2.5-Coder-32B-Instruct In this paper, we propose EEVEE, the first…

6
The Information — AI news-outlet 22d ago

OpenAI in Talks to Lease 10 Gigawatt Ohio Data Center with Backing From Nvidia

OpenAI is in advanced negotiations to lease a proposed 10 gigawatt data center campus on federal land in Ohio as part of a deal that could include financial backing from Nvidia, according to two people with direct knowledge of the discussions. The campus under discussion would…

12
r/LocalLLaMA community 22d ago

Furiosa AI selling inference chip to consumer market will be a game changer to local llm

This is a South Korean startup going all-in on inference chips: https://furiosa.ai/renegade-spec Specs: TSMC 5nm node, Hynix HBM3 at 1.5TB/s, 48GB VRAM, 180W TDP. Already tested on LG LLM. If they opened their programming interface the way NVIDIA opens PTX and Intel opens SPIR-V, and team up…

12
Hugging Face Daily Papers research 22d ago

Agents' Last Exam

Abstract Agents' Last Exam (ALE) is a benchmark for evaluating AI agents on long-term, economically valuable real-world tasks across 13 industry clusters with 1K+ tasks, revealing significant gaps between benchmark performance and practical deployment. Generated by…

6
The Information — AI news-outlet 23d ago

Broadcom to Help Finance Anthropic, OpenAI Chip Deals With Apollo, Blackstone

Broadcom said Tuesday that it is launching a new fund—backed by Apollo and Blackstone—to help finance more than 20 gigawatts of AI data centers through 2028 using chips designed by Broadcom, including projects tied to Anthropic and OpenAI. Apollo will lead an initial $35 billion…

19
Google DeepMind official-blog 23d ago

Powering the future of robotics in Europe

Jun 09, 2026 · Google DeepMind Accelerator selects 15 robotics companies from across Europe to join the program. Providing 3 months of intensive mentorship and technical support, enabling the…

22
r/LocalLLaMA community 23d ago

Apple announced new on device inference engine for Apple Silicon

This news seems to have flown under the radar. Apple announced CoreAI at WWDC, which is basically a future replacement for CoreML and an alternative to MLX/llama.cpp/torch for on-device optimized inference, especially on phones and tablets. The model weights need to be converted…

25
TechCrunch — AI news-outlet 23d ago

How an e-scooter founder raised $5 million to build space data centers

Orbital founder Euwyn Poon built 250,000 scooters at Spin. Now he wants to launch 10,000 space data centers.

27
Hugging Face Daily Papers research 23d ago

Text-to-Image Models Need Less from Text Encoders Than You Think

Abstract Text-to-image models primarily utilize basic text representation aspects like word merging and order rather than complex contextual information encoded in full text embeddings. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Text-to-image models rely on text prompts as…

36
arXiv — Machine Learning research 23d ago

The Routing Plateau: Understanding and Breaking the Accuracy Limits of LLM Routers

arXiv:2606.07587v1 Announce Type: new Abstract: LLM routing has become a popular approach to improve the cost-quality trade-off of LLM services by dynamically selecting a model for each query. Recent work has explored a broad range of routing methods, including clustering-based…

29
arXiv — Machine Learning research 23d ago

EssentialGIN: a new approach for gene essentiality prediction based on graph isomorphism neural networks

arXiv:2606.07700v1 Announce Type: new Abstract: Background: Prediction of essential genes (proteins), is a basic and challenging problem but at the same time very costly and time-consuming in wet-lab experiments. Predicting essential genes, only based on computational methods…

5
r/LocalLLaMA community 23d ago

New MLX LM Server From Apple

Key Technical Advantages: Performance: The M5 chip's neural accelerators significantly boost prompt processing. Concurrency: MLX LM Server utilizes continuous batching to handle multiple sub-agent requests simultaneously without stalling. Scaling: For massive models that exceed…

15
NVIDIA Developer Blog official-blog 23d ago

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

Pre-training frontier LLMs comes down to throughput. When training spans trillions of tokens across thousands of accelerators, every percentage point of step...

34
Hacker News — AI on Front Page community 23d ago

Show HN: Gitdot – A better GitHub. Open-source, written in Rust

What works now: user signups, org creations, private/public repos, and importing GitHub repositories (both as read-only mirrors and full migrations). So basically, you can create, push and pull to a repo, but we don't have many features quite yet (issues, PRs, CI). What is a bit…

34
Hacker News — AI on Front Page community 24d ago

A Farmer Donated Land to Turn into a Park. The City Is Building a Data Center

Article URL: https://www.404media.co/a-farmer-donated-land-to-turn-into-a-park-the-city-is-building-a-massive-data-center-instead/ Comments URL: https://news.ycombinator.com/item?id=48446439 Points: 252 # Comments: 128

30
The Information — AI news-outlet 24d ago

Developers of OpenAI’s Stargate Data Center Face Higher Costs

On a dusty stretch of prairie in Abilene, Texas, anxious hardware engineers from Crusoe, a data center developer for OpenAI and Oracle, have been working overtime to get natural gas turbines to work harmoniously with one of the most expensive AI supercomputers in history. It has…

37
arXiv — Machine Learning research 24d ago

Towards Serverless Semi-Decentralized Federated Learning with Heterogeneous Optimizers

arXiv:2606.06687v1 Announce Type: new Abstract: We investigate cluster formation, involving the number and composition of clusters, in decentralized federated learning (FL) with heterogeneous machine learning (ML) optimizers. While clustering in centralized FL has enabled…

21
arXiv — Machine Learning research 24d ago

SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling

arXiv:2606.06820v1 Announce Type: new Abstract: Agentic Large Language Model (LLM) systems decompose complex tasks into workflow Directed Acyclic Graphs (DAGs) whose primitives must be scheduled on heterogeneous clusters. Existing deep reinforcement learning (DRL) schedulers are…

26
arXiv — Machine Learning research 24d ago

Explaining Unsupervised Disease Staging in Huntington's Disease: Insights into Model Representations and Clusters

arXiv:2606.07135v1 Announce Type: new Abstract: Huntington's disease (HD) is a progressive neurodegenerative disorder that affects motor, cognitive, and behavioral functions, where accurate characterization of disease progression remains essential to improve patient outcome and…

25
arXiv — Machine Learning research 24d ago

Unsupervised Continual Clustering via Forward-Backward Knowledge Distillation

arXiv:2606.07474v1 Announce Type: new Abstract: Unsupervised Continual Learning (UCL) aims to enable neural networks to learn sequential tasks without labels or access to past data. A major challenge in this setting is Catastrophic Forgetting, where models forget previously…

23
arXiv — NLP / Computation & Language research 24d ago

Multilingual Multi-Speaker Unit Vocoders: A Systematic Analysis of Discrete Speech Representations

arXiv:2606.06740v1 Announce Type: cross Abstract: Discrete speech units obtained via k-means clustering of self supervised embeddings entangle phonetic, speaker, and language information, causing speaker mixing and cross-lingual interference in multilingual multi-speaker speech…

22
r/LocalLLaMA community 25d ago

Clustering 3x Jetson Nano Orin Supers

Hey everyone! Recently, I released a blog on how to set up a cluster out of your Raspberry Pi 4bs and Mac minis for distributed training and inference. Now it's time to do the same with Jetson Nano Orin Super! Why? - 1024 CUDA Cores (Ampere) - 8GB unified memory LPDDR5 - 6x ARM…

26
Hacker News — AI on Front Page community 26d ago

Google to pay SpaceX $920M a month for compute capacity at xAI data centers

Article URL: https://www.cnbc.com/2026/06/05/google-to-pay-spacex-920-million-a-month-for-xai-compute-capacity.html Comments URL: https://news.ycombinator.com/item?id=48417490 Points: 231 # Comments: 804

38
Ars Technica — AI news-outlet 26d ago

"We pissed off a lot of people": Giant data center plan cut 50% amid protests

Developer felt "beaten up," with "no choice" but to shrink data center.

35
r/MachineLearning community 27d ago

How do you identify researchers who are good? [D]

About 10 years ago, I got into the basics of ML (like regression, KNN's, LVQ's) and read a few papers before taking a break a few years back. It feels like now, there's a lot of researchers in AI. How do you identify the ones who are actually solid vs those who (forgive my…

19
TechCrunch — AI news-outlet 27d ago

AirTrunk commits $30B to build 5GW of AI data centers in India

The Australian data center operator plans to set up 5GW of capacity in India.

14
arXiv — Machine Learning research 27d ago

Staged Factorial Screening for Budget-Constrained Micro-Pretraining

arXiv:2606.05186v1 Announce Type: new Abstract: Budget-constrained micro-pretraining often requires triaging many candidate recipes on a shared accelerator before larger search budgets are spent. We study whether a staged fractional-factorial workflow can recover stable early…

14
arXiv — Machine Learning research 27d ago

A Machine Learning-Based Framework for Discovering Huntington's Disease Stages: Integrating Graph Representation Learning and clustering to Uncover Progression Dynamics in Longitudinal Enroll-HD Dataset

arXiv:2606.06196v1 Announce Type: new Abstract: Huntington's disease (HD) is a progressive brain disorder that gradually affects movement, cognitive function, and behavior. Identifying the stage of the disease accurately and consistently is important for understanding its…

31
The Information — AI news-outlet 27d ago

Data Center Developer Switch in Talks to Raise Billions at $50 Billion-Plus Valuation

Data center developer Switch is in talks to raise billions of dollars at a valuation of at least $50 billion, a level that would make it one of the most valuable privately held data center operators, The Information reported late Thursday. Brookfield Asset Management, KKR and…

28
The Information — AI news-outlet 27d ago

Data Center Developer Switch in Talks to Raise Billions at $50 Billion-Plus Valuation

Data center developer Switch is in talks to raise billions of dollars at a valuation of at least $50 billion, as it seeks to capitalize on soaring demand for the infrastructure needed to support artificial intelligence, according to people with knowledge of the deal. Brookfield…

34
r/MachineLearning community 27d ago

Scrap the LLMs. Scoring 4.76% on the brand new ARC-3 using pure code, a 2012 AMD CPU, and zero AI tokens.[P]

Hey everyone, The ARC Prize 2026 just launched the interactive ARC-AGI-3 track, and the collective AI world is panic-renting massive H100 clusters trying to get multi-billion parameter LLMs to navigate these dynamic environments. Predictably, out-of-the-box LLMs are faceplanting…

31
TechCrunch — AI news-outlet 27d ago

Meta steals a tactic from Tesla and builds data centers in tents

Meta may have found one way to slash its massive data center bill: tents.

9
r/LocalLLaMA community 27d ago

Qwen 3.6 27B 30GB Same top p: 98.358 ± 0.033 % vs UD Q8 K XL 33GB Same top p: 97.426 ± 0.041 %

This is not a diss to Unsloth, they make great quants and really move this community forward. I've been experimenting with quanting specific sublayers based on which ones have the most outliers post Q8 quant. I basically did a BF16 to Q8_0 conversion and looked at the post quant…

8
Ars Technica — AI news-outlet 28d ago

How some data center operators are tackling their water use problems

Hyperscalers have come under scrutiny for their impact on water quality and availability.

7
The Information — AI news-outlet 28d ago

Fusion Startup Helion Nearly Triples Valuation to $15.5 Billion in Thrive-led Round

Helion Energy, a nuclear fusion startup backed by OpenAI’s Sam Altman, still has to prove it can produce electricity to serve data centers and other customers. But investors seem confident it can deliver. The Everett, Wash.–based company said it has raised $465 million in…

33
arXiv — Machine Learning research 28d ago

Contrastive Learning and Correlation Clustering for Sequences of Network Telescope Data

arXiv:2606.04733v1 Announce Type: new Abstract: Understanding activities of Internet scanners is challenging; it often requires identifying relationships between sources, a task for which semantic annotations are scarce. This work investigates whether semantically meaningful…

36
arXiv — NLP / Computation & Language research 28d ago

Arithmetic Pedagogy for Language Models

arXiv:2606.05106v1 Announce Type: new Abstract: We investigate whether methods of human mathematics pedagogy can guide the training of language models toward arithmetic reasoning. Building on the GASING method -- an Indonesian pedagogy that solves basic arithmetic through a…

32
arXiv — Machine Learning research 29d ago

A Nonmonotone Gradient-Based Algorithm for Symmetric Nonnegative Matrix Factorization and Graph Clustering

arXiv:2606.02887v1 Announce Type: new Abstract: Symmetric nonnegative matrix factorization (Symmetric NMF) approximates a matrix as $WW^T$ with nonnegative rectangular factor $W$. It has broad applications in graph clustering and machine learning. In contrast to the NMF,…

9
arXiv — Machine Learning research 29d ago

KForge: LLM-Driven Cross-Platform Kernel Generation for AI Accelerators

arXiv:2606.02963v1 Announce Type: new Abstract: Production inference increasingly targets a heterogeneous mix of accelerators. Agentic pipelines interleave reasoning, tool calls, and multi-agent coordination, each with distinct compute and memory profiles. For optimal…

19
arXiv — NLP / Computation & Language research 29d ago

Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models

arXiv:2606.03846v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate remarkable performance across diverse tasks, but they often generate responses that appear plausible while being factually incorrect. This problem is compounded by the lack of explicit…

13
r/LocalLLaMA community 29d ago

I Put a Datacenter GPU in My Gaming PC for £200

Hey there! I wrote a blogpost about my experience running local models on a V100 from a newbie perspective and got loads of views outside of reddit, so I thought I'd share it here too!

33
The Information — AI news-outlet 1mo ago

SK Hynix to Double Capacity as AI Strains Memory Supply

SK Hynix plans to double its memory chip capacity within five years as AI demand keeps straining global supply, Bloomberg reported. The expansion can ease one of the biggest hardware constraints facing AI data centers. Chairman Chey Tae-won said in Taipei that the memory crunch…

20
arXiv — Machine Learning research 1mo ago

PE-means: Improved Differentially Private $k$-means Clustering through Private Evolution

arXiv:2606.00342v1 Announce Type: new Abstract: We study the problem of differentially private (DP) $k$-means clustering in Euclidean space. Previous solutions rely on summing the private data directly, which induces a sensitivity proportional to the domain. We introduce…

17
arXiv — NLP / Computation & Language research 1mo ago

French parsing enhanced with a word clustering method based on a syntactic lexicon

arXiv:2606.00634v1 Announce Type: new Abstract: This article evaluates the integration of data extracted from a French syntactic lexicon, the Lexicon-Grammar (Gross, 1994), into a probabilistic parser. We show that by applying clustering methods on verbs of the French Treebank…

16
arXiv — NLP / Computation & Language research 1mo ago

Agentic Clustering: Controllable Text Taxonomies via Multi-Agent Refinement

arXiv:2606.01255v1 Announce Type: new Abstract: Recent text-clustering methods use large language models to propose a cluster taxonomy from a corpus and then assign each text to it. These pipelines are fundamentally programmatic: the sequence of LLM calls and the rules for…

37
