Posted by brain on July 21, 2004 10:45 PM
After the mix-ups described in yesterday's story about the Kernel Summit, today I managed to get in touch with Arnaldo Carvalho de Melo, a Brazilian who is one of the speakers at the event, which brings together the leading Linux kernel developers.
At my invitation, he agreed to share his summarized notes on what is going on there with us, and to give an interview at the end of the event summing up everything that happened. By popular request, he will also help identify the developers shown in the photo published in yesterday's story.
Given the very nature of the event, most of Arnaldo's notes are in English, and the set also includes fragments of the slides. Note that they were written for his personal use! I chose to publish them with only light editing, since the part that could be lost in translation may interest developers looking for "cutting-edge" information on the plans for the Linux kernel. If anything is not clear enough, leave a comment and we will try to clarify it later, in the interview. Today's notes include, among other topics, comments on the talks by Intel, AMD and IBM and by the developers of features such as memory and CPU hotplug, Suspend to Disk and others. As a complement, Arnaldo suggests checking out the coverage at LWN.net, whose founder is also one of the speakers at the event.
Arnaldo's text follows:
Coverage for BR-Linux
---------------------------------------------------------------------------
Opening - Ted Ts'o
A minority (5 or 6 people) preferred not to have the Kernel Summit recorded and
made available; one anonymous attendee preferred that the executives at his
company not know what he was going to say about products/practices/whatever.
Next year, maybe.
---------------------------------------------------------------------------
Intel CPU Architecture - Frank Binns - Desktop Processor Group Architecture
CPU Roadmap, Multi-core topologies, LaGrande Technology
---
EM64T, NX bit, HT, PCI Express - Intel's "alternative" to AMD64 (laughter)
Question (jejb): does EM64T have an IOMMU like AMD64? No; we will talk about how
to put this in the processor. "No, it is not in the processor" :-)
Based on the P4 core, memory hotplug support
Two cores, 4 threads (HT), 1066 MHz FSB
---
IPF (Itanium) getting a core, single threaded, IA-32EL (it was emulated before)
Multicore Architecture Extension
Intel Montecito Processor -> "Breakthrough Performance for Enterprise and
HPC" - Dual Core, 90nm process technology, Targeting 24 MB L3 Cache, not shared
Frank's prediction: in a few years, systems with 4 cores, 8 threads and
multilevel cache hierarchies; hopefully it doesn't go to L7 (laughter).
CPUID provides information on the number of threads, whether they share caches, etc.
----
Vulnerabilities of the PC Today
Ring 0 can do everything and the kernel runs in it; avoid exploits
CPU Extensions:
- Enables domain separation
- Sets policy for protected memory
                       Intel CPU
                           |
Protected <-------- Intel (G)MCH ----> RAM
Graphics                   |
- trusted channel          |
  between graphics         |
  and trusted software     |
                          ICH -> LPC -> TPM (Trusted Platform Module)
                           |            protected keys, digital certs,
Protected <----------------+            attestation certificates
Keyboard / Mouse (USB)
- trusted channel between
  kbd/mouse and trusted software
LT Protection Model
Kernel provides protected partition services
---------------------------------------------------------------------------
AMD
AMD64 status report for kernel summit - Rich Brunner, AMD Fellow
He thanked the colleagues from IBM and Intel for using the Kernel Summit to
release information about new technologies.
Linux has become a proving ground for 64-bit technology
Significant year for AMD, seeing AMD64 hit the road; he thanked Andi, Andrea,
Dave Jones, etc.
More than 15 distributions and community projects support AMD64, and even
our competitors, Intel (laughter), support this technology.
Not only for the server market, for desktops too.
AMD is supporting FOSS orgs and conferences with dollars and hardware:
- OSDL, FSG, GCC conference, OKS/OLS
AMD continues its trend of publicly providing technical information
to the technical community.
A Word from Our Sponsors (They are paying for a lunch/dinner because I
included this slide (laughter) ).
Dual core processor tech allows AMD to continue to offer
a competitive performance roadmap while meeting the system
architecture demands of our customers
Processors based on AMD64 with Direct Connect Architecture were designed
from the start to add a second core
Dual core processor will be in AMD's forthcoming 90 nm process
expected availability in mid-2005
one die with 2 cpu cores
physical chip/package plugs into a socket on the motherboard
populated socket contains a chip/package with a number of integrated
cores
cpu numbering scheme uses LSBs of initial APIC ID to distinguish cores in
one processor package
- example: 2 packages populate 2 sockets and 2 cores per package/socket
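The numbering scheme above can be sketched as follows. This is only an illustration, assuming the core index occupies just enough low-order bits of the initial APIC ID to number the cores in one package and that the remaining bits identify the package; the function name `split_apic_id` and the `cores_per_package` parameter are hypothetical and do not come from the slides.

```python
def split_apic_id(apic_id, cores_per_package):
    """Split an initial APIC ID into (package, core).

    Assumption (not from the slides): the core index lives in the
    least significant bits, using just enough bits to number every
    core in one package; the remaining bits identify the package.
    """
    if cores_per_package < 1:
        raise ValueError("cores_per_package must be at least 1")
    core_bits = (cores_per_package - 1).bit_length()
    core = apic_id & ((1 << core_bits) - 1)
    package = apic_id >> core_bits
    return package, core


# Example from the notes: 2 populated sockets, 2 cores per package.
for apic_id in range(4):
    package, core = split_apic_id(apic_id, cores_per_package=2)
    print(f"APIC ID {apic_id}: package {package}, core {core}")
```

With this split, software that counts distinct package values sees sockets, while software that counts APIC IDs sees cores.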
It is always possible to have the kernel not use the second core (performance
reasons) (?, check this out...)
AMD suggests to developers that software be licensed based on the number of
populated sockets and scheduled based on the number of cores
for certain workloads, since N integrated cores share chip I/O & Resources,
there could be less performance than N discrete cores
. So why charge the same?
- Focusing on sockets and not cores reduces end-user confusion
. So how can software distinguish sockets from cores and
do the right thing?
. unique initial APIC ID assigned to all cores
- ACPI-MADT and MPS tables record all cores just like discrete case
. New extended CPUID function returns on any core the number of cores in the associated socket
. great for new software to use to figure number of cores
. but legacy software doesn't know about new CPUID function.
It understands only 2 models:
- discrete SMP and SMT/hyperthread
. So cpuid (eax=1) on all cores in a dual core pkg returns:
- CPUID.HTT=1 (edx[28])
. the fact that this may appear as HT to legacy software is just a
happy co-incidence
- CPUID.logical_number_of_processors = 2 (EBX[23:16])
- new extended CPUID feature bit, HTVALID, which tells new
software if the HTT fields above report hyperthreading
. HTVALID will be zero on AMD dual core indicating no HTT support
. legacy software support for 2-logical cores, while more restrictive,
appears to work equally fine for 2-physical cores
- hyperthreading scheduling rules work fine for multi-core
- hyperthreading shared MSR rules work fine for multi-core
- AMD evaluation of legacy software has thus far not found any major
problems with this assumption
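The CPUID fields discussed above can be decoded roughly like this. It is a sketch that works on raw register values rather than executing CPUID; the function name is hypothetical, and because the notes do not say where the new HTVALID bit lives, it is passed in by the caller as a plain boolean.

```python
HTT_BIT = 28  # CPUID function 1, EDX bit 28 (CPUID.HTT)


def describe_cpuid_leaf1(ebx, edx, htvalid):
    """Interpret CPUID (eax=1) output as described in the notes.

    ebx, edx: raw register values returned by CPUID function 1.
    htvalid:  value of the new extended feature bit; its location is
              not given in the notes, so the caller supplies it.
    """
    htt = (edx >> HTT_BIT) & 1
    logical = (ebx >> 16) & 0xFF  # logical processor count, EBX[23:16]
    if not htt or logical < 2:
        return "single logical processor per package"
    if htvalid:
        return f"{logical} hyperthreads per package"
    return f"{logical} cores per package (no hyperthreading)"


# What a dual-core package would report according to the slides:
ebx = 2 << 16        # logical processor count = 2
edx = 1 << HTT_BIT   # HTT = 1
print(describe_cpuid_leaf1(ebx, edx, htvalid=False))
```

Legacy software that ignores `htvalid` would take the same register values to mean two hyperthreads, which is the "happy coincidence" the slide mentions.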
More details for Break-out session
H. Peter Anvin suggested a few things about how to structure the CPUID bits.
Rich said that since all of this is done in the BIOS, the idea could be implemented,
but that it should be discussed in the BOF; AMD came to the Kernel Summit precisely
to hear these ideas and discuss them, eventually adopting them.
----------------------------------------------------------------------------
IBM Power5 systems and Linux
Dr. Balaram Sinharoy
POWER5 Chief Scientist
IBM Systems and Technology Group
600 people in the LTC, Linux is big within IBM.
Power5 has two cores, each core with SMT (no more glue needed to communicate with other processors, Enhanced Distributed Switch directly on the CPU)
1.9 MB shared L2 on chip
support in the processor for direct access to the 36 MB L3 cache
First Dual Core SMT processor
- 8way superscalar SMT cores
. chip has 276 million transistors, 389 mm2 die
- 1.9 MB L2 cache - point of coherency
- on chip L3 directory, memory controller
- Technology
. 130 nm lithography, SOI
. Cu wiring, 8 layers of metal
- High-speed elastic bus interface
. I/Os: 2313 signal, 3057 power
Simultaneous Multi-threading in Power5
each chip appears as a 4way SMP to software
- processor resources optimized for enhanced SMT performance
. software controlled thread priority
. dynamic feedback of runtime behaviour to adjust priority
. dynamic switching between single and multithreaded mode
In single threaded mode the thread has better performance than in SMT
2 branch prediction mechanisms and one selector mechanism to choose between those
2 BP mechanisms
120 registers, mapping between registers
Power management is the most important issue in modern processors
He showed pictures of the processors in single thread mode and in SMT, with and
without dynamic power management (DPM); those without were almost roasting
(yellow), those with DPM were dark :)
Linux task scheduler on POWER5:
. understands system structure (DCM, MCM), on-node vs off-node access cost and dispatches tasks intelligently
. memory affinity based scheduling keeps tasks local for fast access to 36MB
L3 and local memory at high bandwidth
. schedule task to the same processor chip/MCM where it ran earlier to reduce communication to another chip or another MCM
- SMT advantages
. SMT mode has better instruction throughput per core, but lower single thread performance -> dispatch tasks to the processor in ST mode first and then change the cores to SMT mode to dispatch more tasks
. thread priority:
- lower the thread priority (kernel or applications) when in spin-lock to give more processing cycles to the other thread
- many other things based on thread priority
. SMT performance gain is typically 40%
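The "ST mode first, then SMT" dispatch rule above can be illustrated with a toy placement function. This is a sketch under simplifying assumptions (identical cores, no affinity or priority, tasks beyond the hardware thread count simply left unplaced); `place_tasks` and `core_modes` are hypothetical names, not the Linux scheduler's API.

```python
def place_tasks(num_tasks, num_cores, threads_per_core=2):
    """Return the number of tasks placed on each core.

    Tasks go to idle cores first so each runs in single-threaded (ST)
    mode; only once every core is busy do cores take a second task
    and switch to SMT mode. Tasks beyond the hardware thread count
    are left unplaced (a real scheduler would queue them).
    """
    capacity = num_cores * threads_per_core
    load = [0] * num_cores
    for i in range(min(num_tasks, capacity)):
        # Round-robin over cores fills one thread per core before
        # any core receives its second thread.
        load[i % num_cores] += 1
    return load


def core_modes(load):
    """Label each core as idle, ST (one task) or SMT (more than one)."""
    return ["idle" if n == 0 else "ST" if n == 1 else "SMT" for n in load]


# One dual-core chip: two tasks stay in ST mode, a third forces one
# core into SMT mode.
print(core_modes(place_tasks(2, num_cores=2)))
print(core_modes(place_tasks(3, num_cores=2)))
```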
. Memory and cache performance
- Linux understands system topology and allocates memory on the same node where the task is running
. Lines striped across multiple memory controllers on MCM -> high bandwidth for burst data access
. speculative memory access before cache intervention -> fast access to global kernel data, heavily threaded apps
- per-cpu local kernel data is obtained fast from local caches and memory
- data prefetching in kernel code is useful
- fast cache-to-cache data transfer with the closest cache sourcing data
. coherent ICache: no need to flush an executable page from icache before giving the page out to the user
- Low power mode
- when no tasks to run, put a thread first at very low priority and then to sleep mode -> significant power saving
Virtualization: virtual IO, virtual SCSI and virtual Ethernet
. Hardware lab bring up advantages of linux
- Linux allows rapid prototyping to verify new functions, fast performance tradeoff analysis
. performance tuning knobs in the hardware can be optimized quickly in the lab
- quick validation that high percentage of the memory and IO bandwidth can actually be sustained
- Future opportunities
. how to schedule tasks on SMT cores and multi-core chips with runtime power and performance feedback
. how to tune the processor and memory subsystem at runtime based on workload characteristics
. cooperative threads for improving single thread performance
2001 - dynamic LPAR (16)
2002-3 - 32 LPAR
- Server virtualization: (up to 254 partitions per SMP)
. Up to 10 partitions per processor -> total 1280 logical partitions (128 threads x 10 partitions)
Recently announced for Linux, AIX and OS/400 (i5/OS) based systems
Chips are 95mm x 95 mm == a 3 1/2-inch diskette
Linux on the Power5 before AIX? (laughter) -> it seems so, to be checked
----------------------------------------------------------------------------
Memory Hotplug
(a Japanese speaker - check later, VA Linux Systems - Japan)
- Separating freeable/non-freeable memory
- Define fictitious ZONE_HOTREMOVABLE to logically separate highmem
and hotremovable notions (eg. for 64bit archs)
memsection/pernode
. Disable allocation from the removing area
. should work with memsection (hugepage dynamic alloc code can be used)
. remap(migrate) operation overview
Without blocking, other accesses can get in the way.
. blocks memory access using locked !uptodate pages
. wait until nothing is referencing the page
. copy and finish
. (a bit more for anon memory)
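The three remap steps above can be walked through in a small user-space model. This is not kernel code: `Page`, `migrate` and the `release_one_ref` callback are hypothetical stand-ins, and the "wait" step is modelled by calling the callback until the reference count reaches zero.

```python
class Page:
    """Toy model of a page: contents plus the state the steps rely on."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.locked = False
        self.uptodate = True
        self.refs = 0  # number of users currently referencing the page


def migrate(old, release_one_ref):
    """Move the contents of `old` to a new Page and return it.

    release_one_ref(page) stands in for waiting: each call must let
    one existing user finish and drop its reference, otherwise this
    loop never ends.
    """
    # 1. Block new accesses: a locked, !uptodate page makes readers wait.
    old.locked = True
    old.uptodate = False
    # 2. Wait until nothing references the page.
    while old.refs > 0:
        release_one_ref(old)
    # 3. Copy and finish.
    new = Page(old.data)
    old.locked = False
    return new


def drop(page):
    """Simulate one user finishing with the page."""
    page.refs -= 1


page = Page(b"hello")
page.refs = 2
moved = migrate(page, drop)
print(bytes(moved.data), page.refs)
```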
not as simple as I imagined first (eg. unwinding), but this is the least ugly
status - functional, including hugepage
acme: this can be simulated by booting the machine with, say, 3 memory modules
when the machine has 4, simulating the memory hotplug connection events and
then using the module that was left over :)
Linus: if you work for a hardware company, convince them to do the
memory remap in hardware, it solves everything, the memory module stays in a
separate zone :) Linus thinks the common case is adding more memory, but
there are cases
----------------------------------------------------------------------------
Hotplug CPU
Rusty Russell (IBM)
Pavel Machek needs this code for software suspend. Rusty: the
infrastructure is there. Most of the planet will be SMP in 6 months (SMT,
SMP with multiple cores). Turn CPUs off when nothing is happening (save power,
Luke), bring them back when there is work, etc. Suspend has to handle the case
of the last CPU, i.e. saving state, etc.
"Read the code!" Rusty -> applause :)
----------------------------------------------------------------------------
VM (Scrap this one)
Hugh Dickins - Veritas
What is Page Clustering: minimum building block from physical to virtual,
also used for memory allocation.
----------------------------------------------------------------------------
Suspend to Disk
- Pat Mochel
(Three) Two implementations
- pmdisk and swsusp merged
- non-technical issues resolved - 1 step closer to ending world hunger
- Nigel Cunningham's "swsusp 2" patches - pros & cons
Works on many systems (On all systems tested)
Many drivers still lacking support
- need better way to track progress of driver support
Many new recent features
- highmem support
- SMP support
Linus: the ones writing the code make the decisions, too much flames over
how it should be from people not contributing patches.
Linus: how many people here are using swsusp? Only about 5 people
raised their hands... I.e. there is no point in arguing about how it should be,
that is "decision by committee", i.e. it does not work; more people have to test
and whoever writes the code decides how it will be.
Ted Ts'o: what is missing to merge swsusp2? Splitting the patch
into smaller pieces, documenting them and submitting them for inclusion; Nigel
has not done this yet.
Future?
. handle user/kernel processes separately
Suspend to RAM
. status on x86?
User Visible Features/Issues
- Userspace Interaction
. Hotplug calls - when?
- Device Power Management
. Interface?
. Timers/events?
- X Interaction
. Request for notification for saving 3d state
. Other issues?
---------------------------------------------------------------------------
Valid Complaints
. hard to use correctly / not documented well
. struct kobject and sysfs are too interconnected
. struct kobject and struct kref are too big
. sysfs takes up too much lowmem
. bus lists are locked when probing
Future:
. fix documentation
. make struct kref smaller
. make struct kref work with rcu
. fix sysfs lowmem issue
Future: 2.7
. make API harder to use incorrectly
. split struct kobject and sysfs
. make struct kobject smaller
. multithreaded device probing
----------------------------------------------------------------------
Graphics Drivers - Kernel video card drivers
- Keith Packard
- current nightmare
- general co-processor problem
- requirements
------
- fbdev "owns" the graphics card
. mode selection
. text mode support
- DRI "owns" the video card
. interrupts
. dma
. some memory management
- X server "owns" the video card
. mode selection
----
General Co-processor problem
- Video cards are co-processors
. separate memory space
. separate processor clock
- Need resource management
. memory management
. scheduling
The x86 cpu is the slowest processor in your machine, graphics card much faster ;) if you have a cache, you have a problem
-------
Memory management
512 MB cards available today
. do the math
. most video card data can be discarded
. need real shared allocator
Data in video card is discardable, comes from the hard disk, i.e. apps can
redo it when coming back from suspend
-----
Mode selection
- Kernel mode
. sometimes needs vm86
. sometimes large
. performance insensitive
- user mode
. needed at boot time
. needed for panic
- mixed
. kernel mode api
. may use hotplug-like helper
Get DRI, etc and unify, blasting OOPS messages on the user is needed
Linux is what matters, HP doesn't sell any BSD systems anymore, so if
we need to change how X drivers work, getting out of the polling mode
nightmare that we have today, 90% of the time XFree86 is waiting for the
graphics coprocessor, if the other operating systems don't follow suit...
well, keep using the existing hardware. Keith is rewriting most of the
X server.
Video drivers into the kernel, API drift, kernel releases may make it
"interesting", etc. 3 months from X server release to drivers released for
it.
-------------------------------------------------------------------------
5 minutes
Linus: 2.6 is doing very well, not a lot of complaints about 2.6. Ask a few
people about 2.6: they're happy. That is not good ;-) Linus excited about
bitkeeper, Andrew is not. Linus wants to open 2.7 soon. To think about for
tomorrow 2.7 Kernel Summit closing session.
Willy/Andy/Linus on PHT, the MTRR killer on PCI-X
Scalability: problems on big machines, up to 4 or even 8 it is ok, more?
problems, more RCU usage, etc.
kernel ABI, headers, feedback first from the GLIBC hackers or do it and
present to them? Linus thinks its not gonna happen, hch thinks it may
well be possible, perhaps going to Jakub and not to Ulrich Drepper? ;)
OSDL on test environments, multisystems, no need to go to some web page,
receive the results by e-mail, etc, report only the deviations, etc.
Rewrite the clock system, S/390 has issues, PPC64, small embedded systems, etc
cpu timers, not jiffies, X needs something like jiffies, etc
Very good, Augusto; could someone out there translate it, hehe.
I forgot to thank ACME too. Very good.
I wish the X interface were valued more! I hope Mr. Keith Packard's work is a success!
These are certainly the notes of a geek. :-)
Acme rocks.
nelsonvn:
Unfortunately, what kills any project that tries to create a modern graphics system is the hardware. In other words, there is not much point in having excellent projects like DirectFB and XFree86 if the companies that dominate the accelerator card market do not care and do not release their specifications.
A colleague of mine, a fundamentalist critic of this development model, asked me whether any companies took Linux seriously. Well, a word to the wise is enough: the answer for him is right up there. I am glad about the support the industry is giving the Linux kernel. It is a good example to be followed by makers of other kinds of processors (graphics accelerators, DSPs, etc.). Even taking into account everyone's historical rivalries, everybody wins by offering specifications and joining this open development model.