Tensor Processing Unit

A tensor processing unit (TPU) is a type of application-specific integrated circuit (ASIC). Its specific purpose is to accelerate machine learning. The first TPU was designed by Google and announced in 2016. TPUs are applied in robotics and artificial intelligence.^[1]^[2]^[3]

TPUs have outperformed the best general-purpose processors and GPUs (graphics processors) because they increase parallel processing capacity by moving from 64-bit to 8-bit data.^[4]
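The trade-off behind the move to 8-bit data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `quantize` and `int8_dot` helpers, the sample values, and the scale factors are hypothetical and do not come from the cited sources. It shows real numbers mapped to signed 8-bit integers, an integer dot product, and the result rescaled back for comparison.

```python
def quantize(values, scale):
    """Map real numbers to signed 8-bit integers with a fixed scale factor."""
    quantized = []
    for v in values:
        q = round(v / scale)
        quantized.append(max(-128, min(127, q)))  # clamp to the int8 range
    return quantized


def int8_dot(a_q, b_q):
    """Dot product of two int8 vectors, summed in a wider integer."""
    return sum(x * y for x, y in zip(a_q, b_q))


# How many 8-bit values fit in the space of one 64-bit value.
LANES_PER_64_BITS = 64 // 8

# Hypothetical example data and scale factors (not from the article).
weights = [0.5, -1.0, 0.25]
inputs = [1.0, 0.5, -2.0]
w_scale = 1.0 / 127  # the largest weight magnitude, 1.0, maps to 127
x_scale = 2.0 / 127  # the largest input magnitude, 2.0, maps to 127

w_q = quantize(weights, w_scale)
x_q = quantize(inputs, x_scale)

# Rescale the integer result back to real units and compare.
approx = int8_dot(w_q, x_q) * w_scale * x_scale
exact = sum(w * x for w, x in zip(weights, inputs))
print(LANES_PER_64_BITS, approx, exact)
```

The 8-bit result is close to the full-precision one but not identical, which is the precision given up in exchange for packing more values into the same number of bits.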

Architecture

The TPU is a machine (an automaton in the form of an integrated circuit) for multiplying 8-bit data arranged as matrices. The processor uses CISC-style instructions. It is fabricated on a 28-nanometre process with a die size of at most 331 mm²; the clock frequency is 700 MHz and the power consumption is 28-40 W. It has 28 MiB of on-chip memory and 4 MiB of internal registers.^[5]
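As a rough sketch of what multiplying 8-bit data in matrix form involves, the Python below multiplies two small integer matrices while summing into wider accumulators, then estimates peak throughput from the 700 MHz clock. The helper names are hypothetical, and the 256 × 256 array of multiply-accumulate units is an assumption taken from the published description of the first TPU, not from the text above.

```python
def matmul_int8(a, b):
    """Multiply two matrices of signed 8-bit integers.

    Products are summed into ordinary Python integers, standing in for
    accumulators that are wider than the 8-bit inputs.
    """
    for row in a + b:
        for value in row:
            if not -128 <= value <= 127:
                raise ValueError("inputs must fit in a signed 8-bit integer")
    rows, inner, cols = len(a), len(b), len(b[0])
    acc = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            for j in range(cols):
                acc[i][j] += a[i][k] * b[k][j]
    return acc


def peak_tops(mac_units, clock_hz):
    """Peak tera-operations per second, counting a multiply-accumulate as 2 ops."""
    return mac_units * 2 * clock_hz / 1e12


ARRAY_SIDE = 256   # assumption: 256 x 256 multiply-accumulate units
CLOCK_HZ = 700e6   # 700 MHz, as stated in the text

print(matmul_int8([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
print(matmul_int8([[127, 127]], [[127], [127]]))  # sum exceeds the 8-bit range
print(peak_tops(ARRAY_SIDE * ARRAY_SIDE, CLOCK_HZ))
```

Under these assumptions the estimate comes to roughly 92 tera-operations per second.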

Products

Tensor Processing Unit generations:^[6]

Generation	TPUv1	TPUv2	TPUv3	TPUv4	TPUv5e	TPUv5p	v6e (Trillium)
Introduced	2015	2017	2018	2021	2023	2023	2024
Process node	28 nm	16 nm	16 nm	7 nm	?	?	?
Die size (mm²)	331	< 625	< 700	< 400	300-350	?	?
On-chip memory (MiB)	28	32	32	32	48	112	?
Clock (MHz)	700	700	940	1050	?	1750	?
Memory	8 GiB DDR3	16 GiB HBM	32 GiB HBM	32 GiB HBM	16 GB HBM	95 GB HBM	32 GB
Memory bandwidth	34 GB/s	600 GB/s	900 GB/s	1200 GB/s	819 GB/s	2765 GB/s	1640 GB/s
TDP (W)	75	280	220	170	?	?	?
TOPS (tera operations per second)	23	45	123	275	197 (bf16) 393 (int8)	459 (bf16) 918 (int8)	918 (bf16) 1836 (int8)
TOPS/W	0.31	0.16	0.56	1.62	?	?	?
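The TOPS/W row is simply the TOPS figure divided by the TDP. A minimal Python sketch reproduces it for the four generations whose TDP is listed; the `SPECS` dictionary and `tops_per_watt` helper are illustrative names, and the numbers are copied from the table.

```python
# Figures copied from the table (only generations with a listed TDP).
SPECS = {
    "TPUv1": {"tops": 23, "tdp_w": 75},
    "TPUv2": {"tops": 45, "tdp_w": 280},
    "TPUv3": {"tops": 123, "tdp_w": 220},
    "TPUv4": {"tops": 275, "tdp_w": 170},
}


def tops_per_watt(tops, tdp_w):
    """Energy efficiency: tera-operations per second per watt of TDP."""
    return round(tops / tdp_w, 2)


efficiency = {name: tops_per_watt(**spec) for name, spec in SPECS.items()}
for name, value in efficiency.items():
    print(f"{name}: {value} TOPS/W")
```

The later generations are left out because the table gives no TDP for them.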

Comparison

Processor family	Peak performance	Year
AMD ATI Radeon HD 4800	1 teraFLOPS	2008
Intel Core i7 980 XE	109 gigaFLOPS	2010
Nvidia Tesla GPU	515 gigaFLOPS	2010
Google TPU	92 TOPS (8-bit integer operations)	2017

See also

CISC and RISC-V architectures
Machine learning
Robotics
Artificial intelligence
FLOPS: the number of floating-point operations per second
TensorFlow software

References

[1] "Google's Tensor Processing Unit explained: this is what the future of computing looks like" (in English). TechRadar, 9 April 2017.
[2] "Google Calls for Switch Chip API | EE Times" (in English). www.eetimes.com. [Retrieved 9 April 2017].
[3] "First In-Depth Look at Google's TPU Architecture" (in English). www.nextplatform.com, 5 April 2017. [Retrieved 9 April 2017].
[4] "TPU Beat Intel, Nvidia, says Google | EE Times" (in English). www.eetimes.com. [Retrieved 9 April 2017].
[5] "ISCApaperv3 (2).pdf". Google Docs, 9 April 2017. Archived from the original on 3 July 2017. [Retrieved 9 April 2017].
[6] "So what is a Tensor processing unit (TPU) and why will it be the future of Machine Learning?" (in American English), 7 August 2024. [Retrieved 24 November 2024].
