Tag

Hardware

280 articles archived under #hardware

r/LocalLLaMA community 13d ago

EvoTensile: Evolutionary algorithms for AMD Tensile GEMM kernel tuning

There has been an effort to tune kernels in hipBLASLt so the most basic matmuls can run faster. It's known that on Strix Halo (gfx1151), GEMM with NN and TN input layouts (used in inference) are already well-tuned, while NT and TT layouts (used in training) are not yet tuned.…

8
arXiv — Machine Learning research 13d ago

Exploring the potential of AlphaEarth and TESSERA embeddings for Fine-scale Local Climate Zone Mapping: A case study across five cities in Switzerland

arXiv:2606.20034v1 Announce Type: new Abstract: Understanding urban spatial morphology is critical for climate modeling, risk assessment, and sustainable urban design, and Local Climate Zone (LCZ) mapping provides the basic framework for this. However, many cities still use…

10
arXiv — NLP / Computation & Language research 13d ago

Clusters are All You Need: Pre-Training the Tsetlin Machine with Semantic Clusters from Language Models for Interpretability

arXiv:2606.19815v1 Announce Type: new Abstract: Pre-trained language models such as BERT achieve strong text classification performance but lack transparency, limiting their use in high-stakes settings. The Tsetlin Machine (TM) offers fully interpretable, clause-based reasoning…

25
arXiv — NLP / Computation & Language research 13d ago

TransLaw: A Large-Scale Dataset and Multi-Agent Benchmark Simulating Professional Translation of Hong Kong Case Law

arXiv:2507.00875v3 Announce Type: replace Abstract: Translating Hong Kong Court Judgments from English to Traditional Chinese is mandated by Articles 8-9 of the Basic Law, yet remains constrained by a shortage of parallel resources and rigorous demands on legal terminology,…

38
arXiv — NLP / Computation & Language research 13d ago

ShoppingBench: A Real-World Intent-Grounded Shopping Benchmark for LLM-based Agents

arXiv:2508.04266v4 Announce Type: replace Abstract: Existing benchmarks in e-commerce primarily focus on basic user intents, such as finding or purchasing products. However, real-world users often pursue more complex goals, such as applying vouchers, managing budgets, and…

22
Hacker News — AI on Front Page community 13d ago

Show HN: Are You in the Weights?

With more traffic moving off-web and into LLMs, I got curious about what traces we leave "in the weights". My design partner and I built a site in the past few weeks that checks recognition across frontier and small models. It queries many of them in parallel, clusters the…

37
TechCrunch — AI news-outlet 13d ago

Amazon hopes to challenge Nvidia more directly by selling its AI chips

AWS is in talks to sell its chips to other data centers. CEO Andy Jassy has said this represents a $50 billion opportunity for the company.

37
TechCrunch — AI news-outlet 13d ago

AI data centers just got a government-mandated fast lane to the grid

FERC told grid operators to give data centers a fast lane for interconnections, but it failed to address electricity supply shortages.

30
arXiv — Machine Learning research 14d ago

scGTN: Deep Siamese Graph Transformer Network for Single-cell RNA Sequencing Clustering

arXiv:2606.18672v1 Announce Type: new Abstract: Single-cell RNA sequencing (scRNA-seq) serves a pivotal role in characterizing gene expression at the cellular level, enabling the identification of cell types and advancing the understanding of cellular heterogeneity. Despite the…

25
arXiv — Machine Learning research 14d ago

Online Distributional Prediction via Latent Cluster Geometry Under Drift and Corruption

arXiv:2606.18778v1 Announce Type: new Abstract: Online learning in non-stationary streams is often formulated as tracking a point estimate, but many applications require predicting the full data-generating distribution. We study online distributional prediction under drift and…

7
arXiv — Machine Learning research 14d ago

Seed-Guided Semi-Supervised Clustering by A-Contrario Anomaly Detection

arXiv:2606.18833v1 Announce Type: new Abstract: This paper introduces a semi-supervised clustering framework grounded in the statistical duality between grouping principles and anomaly detection. We address the challenge of robust cluster definition in noisy environments -- a…

38
arXiv — Machine Learning research 14d ago

FoMoE: Breaking the Full-Replica Barrier with a Federation of MoEs

arXiv:2606.19025v1 Announce Type: new Abstract: Pre-training Large Language Models (LLMs) typically demands large-scale infrastructure with tightly coupled hardware accelerators. While increasing model and dataset scale remains the dominant driver of performance,…

9
r/LocalLLaMA community 14d ago

GLM 5.2 Release Video [Made with GLM 5.2]

Everyone's probably seen the remotion thing that went viral a couple months back with CC. It's basically that with GLM 5.2 as the model provider. Close to Fable but still a step below on creativity, top is still Gemini 3.1 pro for vid creation but at least I can see why Design…

21
r/MachineLearning community 14d ago

Contrastive targeted SFT as a mechinterp method - has anyone mapped causal dependency interactions this way? [D]

Hi All, I've been running experiments on targeted SFT for specific capability dimensions on a 31B model. After a small training run to prime the model slightly in the direction I want, I ran a judge across 40 domains scoring six independent quality dimensions. One…

21
r/LocalLLaMA community 14d ago

GLM-5.2 is a win for local AI

I know GLM 5.2's massive 753B footprint means none of us are running it at home without an enterprise cluster, but having a true frontier-level, MIT-licensed coding agent out in the wild makes me optimistic. The distillation potential here is massive. Once the community starts…

38
TechCrunch — AI news-outlet 14d ago

Canadian pension giant joins race to fund India’s AI-fueled data center boom

The Canadian pension giant will acquire an 8.2% stake in CtrlS, a tech giant that operates more than 15 data centers across India.

8
r/LocalLLaMA community 15d ago

Local models went from mostly useless to actually useful really fast. What changed?

Mitchell Hashimoto had a good point earlier: local models went from basically useless to actually useful in what feels like one year. I think that's pretty…

5
arXiv — Machine Learning research 15d ago

C2FL: Clustered Continual Federated Learning under Spatial and Temporal Drift

arXiv:2606.18003v1 Announce Type: new Abstract: Collective Adaptive Systems (CAS) increasingly rely on machine learning to let each node learn from locally sensed data, aligning its behavior with the surrounding environment. Scaling this intelligence, however, raises fundamental…

8
Ars Technica — AI news-outlet 15d ago

Trump admin tries to block Clean Air Act lawsuit over xAI's gas turbines

NAACP lawsuit says xAI uses gas turbines without permits for Grok data center.

19
NVIDIA Developer Blog official-blog 15d ago

How to Optimize Transformer-Based Models for Low-Precision Training

Transformer architectures are the backbone of many modern large language and generative AI models. As these models grow in size, training runs consume more GPU...

5
MIT Technology Review — AI news-outlet 16d ago

Want to get a data center online quickly? Give it some flex.

At the end of a tense and scoreless first half of a soccer match between the English men’s team and rival Germany, millions of Brits let out a collective sigh and did what they so often do in moments of stress: They made tea. That wave of electric kettles clicking on, however,…

26
arXiv — Machine Learning research 16d ago

Distilling Drifting Transformers with Representation Autoencoders

arXiv:2606.15553v1 Announce Type: new Abstract: Representation Autoencoders (RAEs) have improved diffusion and flow models by semantically richer latent space owing to the strongly label-wise clustered DINO features in the pretrained encoders. Yet in the distillation stage, the…

15
r/LocalLLaMA community 16d ago

"My son is a genius coder" - honest Alpha Tester review

"It's not slop - it's an art" - Grandma. Introducing you my few weeks brainstorming and writing of the code. I was rewrite everything my AI was creating so basically it's my own creation. Brain Calculator Pro™ — the calculator that made your calculator obsolete. AI-powered…

11
r/MachineLearning community 16d ago

Could AI training be decentralized like Bitcoin mining? [D]

I’ve been thinking about whether the same basic concept behind Bitcoin could be applied to AI training. In Bitcoin, miners perform proof-of-work and are rewarded for contributing computational resources to secure the network. The actual computation itself isn’t particularly…

15
The Information — AI news-outlet 16d ago

Nvidia’s Share of AI Inference Chip Market Appears to Be Rising

As AI developers and cloud providers have launched server chips to lessen their dependence on Nvidia’s, some analysts and executives at these firms expected the chips to eat into Nvidia’s market share. That doesn’t seem to be happening. Nvidia has actually increased its share of…

4
r/LocalLLaMA community 16d ago

Buying AI accelerators/GPUs in China...

Bit of a long-shot this, but happens I'll be in China next week. Just wondering if there are any Chinese graphics cards/AI accelerators I should be trying to buy when I'm there? :-). I would be looking for something that let me run inference big models (so, lots of (V?)RAM), but…

10
The Information — AI news-outlet 16d ago

Exclusive: Nvidia Server Marketplace Startup Raises $100 Million at $800 Million Valuation

Data center software startup and AI-server broker Hydra Host has raised $100 million at a valuation of close to $800 million, led by Kindred Ventures. Nvidia, Cathie Wood’s ARK Invest, early CoreWeave backer Magnetar, and existing investors Founders Fund and Flume Ventures also…

26
arXiv — Machine Learning research 17d ago

Learning Urban Access Costs from Origin-Destination Flows via Inverse Optimal Transport

arXiv:2606.14157v1 Announce Type: new Abstract: Cities deliver basic services through mixed public-private facility networks, including schools, clinics, transit providers, and subsidized service points. In these systems, planners often observe where households go, but not the…

9
Ars Technica — AI news-outlet 19d ago

$130 billion in data center projects blocked by protests so far this year

Winning fight against AI data centers gives people a "taste of political power."

6
Ars Technica — AI news-outlet 19d ago

When it comes to total water use, AI data centers are a drop in the bucket

Even moderately sized data centers can have an outsized local impact.

23
The Information — AI news-outlet 19d ago

Meta Bought Rivos to Accelerate Its AI Chip Push. It Isn’t Working.

Meta Platforms bought semiconductor startup Rivos last year to accelerate development of in-house chips and reduce its reliance on Nvidia as it pours cash into data centers for its AI ambitions. Now six months since the acquisition closed, Meta is struggling to make it work,…

10
The Information — AI news-outlet 20d ago

Nvidia Pitches Vera CPU to Chinese Customers

Nvidia is pitching Chinese customers on its new Vera central processing units for AI data centers, telling them the chips could be available as soon as August and that orders can begin now, Reuters reported, citing three people familiar with the matter. The push gives Nvidia…

7
Hugging Face Daily Papers research 20d ago

Flash-GMM: A Memory-Efficient Kernel for Scalable Soft Clustering

Flash-GMM introduces an efficient fused Triton kernel for Gaussian Mixture Models that achieves significant speedup and enables processing much larger datasets on a single GPU. (Summary generated by Qwen/Qwen2.5-Coder-32B-Instruct.) We present Flash-GMM, a fused Triton kernel for…

18
The Information — AI news-outlet 20d ago

KKR, Nvidia, Others Launch $10 Billion Data Center Company

Private equity firm KKR, the Kuwait Investment Authority, Nvidia and power generation company Vistra launched a new company on Thursday to finance and help build AI data centers. Nvidia’s role as an anchor investor in Helix signifies another extension of the AI giant’s growing…

29
The Information — AI news-outlet 20d ago

Anthropic Pursues First Data Center Leases, Seeks Financial Backing From Google

Anthropic is moving forward with a plan to control its own servers for developing AI, giving it the ability to cut its computing costs in the long run. The maker of Claude in recent months has signed more than a dozen initial agreements, known as letters of intent, to lease data…

20
r/LocalLLaMA community 20d ago

How I implemented ASR bias for voice transcription models [Open Source]

I've been spending the last couple of weeks building a Wispr Flow clone as an open source project. For context, it is a voice dictation app that lets you type faster, by speaking instead of actually typing. I spent the first week building the basic STT capabilities. One of the…

29
r/LocalLLaMA community 21d ago

Tiny Scale Is All I Can Spare To Play With Transformer

Hi! I am a student from India, this is my first paper that I published. I was curious whether I can combine both Attention and FFN together to save parameters without sacrificing performance, specifically at parameters <= 10M. Basically my intuition was that Attention is dynamic…

32
arXiv — Machine Learning research 21d ago

Mirror Descent Beyond Euclidean Stability: An Exponential Separation in Initialization Sensitivity

arXiv:2606.11431v1 Announce Type: new Abstract: Mirror Descent (MD) extends Gradient Descent (GD) beyond Euclidean geometry and has recently reappeared as a lens for KL-regularized policy optimization in reinforcement learning and LLM post-training. This raises a basic…

10
arXiv — Machine Learning research 21d ago

Efficient Time Series Clustering from Multiscale Reservoir Dynamics with Granular-Ball Anchoring Graph Optimization

arXiv:2606.12077v1 Announce Type: new Abstract: Time-series clustering remains challenging due to the inherent trade-off between clustering effectiveness and computational efficiency. Similarity-based methods often suffer from quadratic complexity caused by pairwise distance…

15
arXiv — Machine Learning research 21d ago

Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders

arXiv:2606.12138v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) are widely used to interpret neural network representations, but their utility depends on whether the learned features are reproducible across training runs. We study this question through feature…

19
arXiv — NLP / Computation & Language research 21d ago

Small Experiments, Cheaper Decisions: A Case Study in Staged Promotion for Micro-Pretraining

arXiv:2606.11387v1 Announce Type: new Abstract: Short pretraining runs can reduce experimental cost, but they can also over-promote configurations that only look strong at tiny budgets. We study an auditable staged-promotion protocol for a fixed micro-pretraining runner on two…

9
r/LocalLLaMA community 21d ago

Tried to benchmark Google’s new on-device dictation models (Eloquent) and basically couldn’t

I tried to benchmark Google’s new on-device dictation app (Eloquent) and basically couldn’t. It drops about half of my dictations. tl;dr Full results are 👉 here. Background: Google shipped a new fully‑local dictation app yesterday with proprietary new models, so I was excited…

5
Hacker News — AI on Front Page community 21d ago

Farmer donates land for a park, city sells it for $10M as data center land

Article URL: https://www.tomshardware.com/tech-industry/farmer-donates-land-for-a-park-city-sells-it-for-data-center-development-usd10-gift-became-usd10m-for-city-government-with-usd30m-tax-expected-over-next-decade Comments URL: https://news.ycombinator.com/item?id=48481126…

32
NVIDIA Developer Blog official-blog 21d ago

Designing Production-Ready Battery Energy Storage Systems for AI Factories

AI factories are changing what data-center infrastructure must do. Unlike traditional data centers, AI factories are built to manufacture intelligence at scale....

29
TechCrunch — AI news-outlet 21d ago

The three hard-tech moonshots fueling SpaceX’s unbelievable IPO

Most of the value in SpaceX's IPO is effectively a call option on the company's ambitious space data center plans.

23
OpenAI official-blog 21d ago

PRC-linked influence operations are targeting AI debates in the US

A new report from OpenAI details PRC-linked influence operations using AI to target U.S. tech debates, data center narratives, tariffs, and false claims about ChatGPT.

7
r/MachineLearning community 21d ago

Should I Commit and Publish the Results? [R]

Hello Reddit, I've been working on QSPR (Quantitative Structure-Property Relationship) analysis for chemical compounds mentioned in the Jean-Claude Bradley Open Melting Point Dataset. Basically the idea is to see how accurate a model can predict melting points of compounds using…

32
TechCrunch — AI news-outlet 22d ago

Meta signs first AI data center deal in India with Reliance

The 168-megawatt facility will support Meta's global AI computing needs and can be expanded over time.

17
arXiv — Machine Learning research 22d ago

FailureScope: Cross-Regime Behavioral Diagnosis of Language Model Weaknesses

arXiv:2606.09878v1 Announce Type: new Abstract: Standard benchmarks report aggregate accuracy, but practitioners need to know which specific capabilities a model lacks. We introduce FailureScope, a behavioral-diagnosis method that clusters evaluation probes by their cross-model…

20
arXiv — Machine Learning research 22d ago

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

arXiv:2606.09924v1 Announce Type: new Abstract: Deploying deep neural networks on memory-constrained edge accelerators is bottlenecked by per-inference off-chip weight transfer rather than computation: the dense network cannot be retained on-chip, and every parameter must be…

29
