Adam Jackson on X release status and process
We do have a list of (70) bugs blocking the release (10101).
But nobody other than Adam has ever looked at that.
It seems we're all heads down now. There were lots of good talks at XDS in the fall about new stuff that was coming, (Gallium, etc.). But it's
Bug 13795: With Mozilla and cairo now using Render, we're using lots of parts of the server/drivers that have never really been used before. As expected, they are broken.
Some of the breakage is avoidable with XAANoOffscreenPixmaps.
Can we drop XAA altogether? Not for this release. EXA isn't quite ready yet. And Glucose isn't happening at all.
Ajax: I'm feeling kind of alone here. Nobody else is looking at this stuff?
Daniels: I just closed a blocker. One fixed.
Ajax: I don't feel like I'm scaling that well. There are huge swaths of the code that nobody cares about. Things change and break, but nobody takes responsibility for things.
Even the drivers that are "well maintained" have problems. We see lots of churn in the drivers, but not releases. Releases are cheap, guys! I don't have a solution for that except shaming people---maybe this is how I get a start.
Every 6 months or so we have a session where we say "what do we want in our next release?" and we write them on the whiteboard. Then, several months later someone, (lately, Ajax), looks at the list and sees which ones are actually coming together, (about 50%), and decides those are the ones that actually go in.
Example of the pciaccess change that was on the "desired" list for several releases, but never went in. Finally, we sat down and forced it in, but found that only the "top 3" drivers had been ported to it, and everything else broke. Fortunately, someone stepped in to save things after the fact. The problem is that we don't have anyone signing up for that janitorial work---or a commitment to "don't break the drivers". But we could have done the whole pciaccess change in-place with a compatibility layer so that nothing would have ever broken.
Jesse Barnes: We had a big flamewar about putting the drivers back "in tree" in one git module or whatever. That never came to consensus, but wouldn't it help? Right now, nobody cares about the "server" and the distribution maintainers end up taking on all the pain. Wouldn't it be better if driver hackers had a vested interest in getting the "big blob of server and drivers" into a releasable state together?
John: Is the big problem lack of resources for driver development?
Someone: Or just lack of policy?
Ajax: I don't think it's lack of resources. If we had done pciaccess correctly, then it wouldn't have been any work---just a sed script.
Daniel: Bisectability is really nice. People can bisect an independent driver and find Jesse's driver bug. But if we had the big blob, and someone tried to bisect, then it would all just be "Oh my God, my keyboard doesn't work". So if we did put everything together, we'd need lots of separate trees, (input, etc.).
Ajax: Do git submodules help us address the issue?
Eric: Zack's done more work with the submodule thing, but I don't think it gives us what we want. We don't get a master supermodule that points to all the "master" branches of the submodules. Instead, it has to have exact sha1 sums for each point. It would be great for tagging releases, but not so nice for "build master".
[Discussion: What was the whole point of modularization?]
Ajax: It's great to be able to build the server without having to build xedit.
Daniel: It was great for distributions.
Jesse: The only thing that makes sense to merge back together is the server and drivers. It's a social problem, but this technical change would help.
Kevin: That puts structure onto the technical side of things, but not any structure onto the social problem.
Ajax: The problem with merging things together is that it doesn't address the modularization problem. What happens when I want to package up the server, but the Intel driver happens to be broken at the time? If all I can grab is one point in the development of everything, then there's a problem there. How can I back out just the Intel driver changes?
John: It sounds like we want monolithic build and test, but we want modular packaging and distribution.
Ajax: One of the problems is that we're just not testing in the first place. And that's a big part of the problem.
Ajax: We actually do have a test suite (XTS) that we got released under a free license a while ago, but nobody is actually running that. And nobody is adding new tests.
Keith: But we can't add tests to that test suite. We are writing new ad-hoc tests. Do we create a new test suite infrastructure? Do we use something like piglit?
Bart: We don't have to solve all our problems with automated tests. We do need to put together the social organization, (maybe even pay people), to make things happen.
[Someone]: We have a lot of server bugs already without automated tests and they already aren't getting worked on.
Jesse: We're getting a little off-topic here. The original problem was that Ajax isn't getting any help with release management.
Daniel: Release manager is a misnomer anyway. There's no herding of others doing work or anything. It's just signing up to be the person that fixes all the bugs.
Eric: The advantage of automated testing is that you find out the regressions quickly and you only have a couple of commits to look at to find and fix the bug.
[A few minutes more discussion on technical vs. social problems, lots of missing testing from anybody. Lack of a decent infrastructure for writing new tests.]
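Daniel's point about bisectability and Eric's point about catching regressions early come down to the same search. A minimal sketch, assuming a linear history where the first commit is good, the last is bad, and everything after the regression stays bad; the commit names and the `is_good` check are hypothetical stand-ins for a real build-and-test step:

```python
def first_bad_commit(commits, is_good):
    """Binary-search a linear history for the first commit that fails.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the regression is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history: the regression landed in commit "c5".
history = ["c%d" % i for i in range(10)]
tested = []


def is_good(commit):
    tested.append(commit)  # record how many builds we had to test
    return int(commit[1:]) < 5


culprit = first_bad_commit(history, is_good)
```

The number of builds tested grows with the logarithm of the range, so the fewer commits that land between test runs, the less there is to search, and the search only means something if each commit can be built and tested on its own.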
Kevin Martin on roadmap and introspection
Our goals include:
- Embrace the dynamic world; hotplug of input and outputs, and all the technical requirements for that
- Smooth, flicker-free user experience; graphical boot, vsync, minimal visible redraw
- Seamless integration of Composite with the rest of rendering; redirected GL/Xv, Composite by default, no software fallbacks
- Secure environment, run X as unprivileged user
These are, of course, the same goals they've been for the last three years or so. We're making progress, but we're not there yet.
What have we been doing?
- RANDR 1.2 integration with Gnome (ssp)
- DRI2 (krh)
- DRM and kernel modesetting (airlied)
- Render acceleration (cworth)
- Cairo 1.6 and pixman (cworth and ssp)
- Shatter and other multiscreen infrastructure (ajax)
- Driver work (all)
We need to finish. What do we need to finish?
Memory management is critical. And it's still not done. How do we finish this?
DRM, kernel modesetting, VGA arbitration.
Monitor hotplug. Have the desktop-side configuration support, really just need the event from the hardware.
- Ensuing discussion about how to wire it up, how to persist beyond the session. Working in Gnome, not propagated out to the global config yet.
- Does the top level win, or does the bottom level win?
- What do we do for input? (Implementation details)
Sync to vertical retrace.
- Intel mostly feature-complete, needs stabilization.
- AMD making excellent progress, but not feature complete.
- nv/matrox/etc are nowhere near done.
Suspend/resume. Oh dear. Matthew will talk about this later.
Video. MC and iDCT and such need to get properly implemented everywhere.
Zack Rusin on Gallium
Gallium is a rework of the DRI driver model. The driver ends up being rather large and opaque, and writing new ones is rough because it's so large.
In Gallium, the driver is split up to isolate the core logic from the hardware from the window system from the OS. The core logic assumes that the driver is shaderful.
This allows better reuse (state tracker stays the same, only the hardware driver changes).
This allows better portability to other OSes or windowing systems.
Still under development, but mostly interface-stable.
Have several drivers already (i915, i965, softpipe, cell). External projects for nouveau and R300.
The big pieces are the state tracker, the winsys layer, and the core driver itself.
- State tracker is the API: OpenGL, OpenVG, D3D9, etc.
- winsys layer talks to the X server and/or the kernel for environment support
- the core driver encapsulates all the hardware knowledge
- "draw", software vertex walk
- CSO, constant state objects, responsible for optimisation of state emission
- buffer management code
- TGSI code, internal representation of shaders
- LLVM integration
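The idea behind the CSO piece can be sketched as a cache: identical state descriptions map to one driver-created object, so the driver translates each distinct state only once. This is an illustration of the technique, not Gallium's actual code; `StateObjectCache` and `create_blend_state` are hypothetical names.

```python
class StateObjectCache:
    """Hand out one immutable driver object per distinct state description."""

    def __init__(self, create_hw_object):
        self._create = create_hw_object  # driver hook, assumed expensive
        self._cache = {}

    def get(self, **state):
        # Order-independent key built from the state description.
        key = tuple(sorted(state.items()))
        if key not in self._cache:
            self._cache[key] = self._create(dict(state))
        return self._cache[key]


created = []


def create_blend_state(state):
    # Stand-in for the driver translating state into hardware commands.
    created.append(state)
    return ("hw-blend", len(created))


cache = StateObjectCache(create_blend_state)
first = cache.get(blend_enable=True, src_factor="one")
second = cache.get(src_factor="one", blend_enable=True)
```

Here `second` is the same object as `first`, and the driver hook ran once.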
It works! Software rendering path works. i915 works on both X and Windows. Works on PS3.
Fallbacks are hard. Fallbacks are really hard. Ideally you wouldn't have them, but that requires a complex state tracker, since OpenGL is really big and weird. Being worked on.
Matthew Garrett on power management
APM used to exist. It more or less worked! But when it didn't, there was no recovering.
Open Firmware exists, but isn't typically relevant for desktops.
Embedded things have their own stuff.
ACPI basically makes no guarantees about what happens when you come back from resume.
The steps are: call device suspend callbacks, platform Prepare-to-sleep method, platform Go-to-sleep method, (suspend), random BIOS stuff, platform Back-from-sleep method, platform wakeup method, device resume callbacks.
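Written out as data, the sequence shows how little of it the driver controls. The step names come from the list above; the "owner" labels are an assumption added for illustration.

```python
# (step, owner) pairs in the order given above. Owner labels are assumed.
SUSPEND_RESUME_STEPS = [
    ("device suspend callbacks", "driver"),
    ("platform Prepare-to-sleep method", "platform"),
    ("platform Go-to-sleep method", "platform"),
    ("(suspend)", "hardware"),
    ("random BIOS stuff", "platform"),
    ("platform Back-from-sleep method", "platform"),
    ("platform wakeup method", "platform"),
    ("device resume callbacks", "driver"),
]


def steps_outside_driver_control(steps):
    """Return the steps during which video state may change behind our back."""
    return [name for name, owner in steps if owner != "driver"]


unowned = steps_outside_driver_control(SUSPEND_RESUME_STEPS)
```

Six of the eight steps run outside the driver, which is why the resume callback cannot assume anything about the state it finds.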
ACPI doesn't require that the platform do anything, but it also doesn't require that the platform do nothing.
Some machines will reenable text mode in BIOS. Sometimes that happens during ACPI methods. Sometimes that only happens if the platform thinks Linux is running. But you'd really rather not go back to text mode at all. Ideally, you'd restore state and go back to graphics directly.
Swapping out all the state is kinda difficult, but doable. The really hard part is getting the device into a consistent state before doing the state restore. Then you need to block X until everything is back to a reasonable state.
(Lots of implementation details)
How do you survive video hotplug? Actually, you're not expected to be able to.
Need to be cleaner about input hotplug; we're not doing it right at the moment. Just a bug.
Ben Byer and Jeremy Huddleston on X11.app
DDX module in hw/xquartz/, old hacked-up version of Mesa
Everything else is plain-jane Xorg, packaged up as X11.app
- XonX: port of XFree86 to Darwin IOKit framebuffer
- merged into XFree86 as XDarwin
- enhanced to do rootless mode: X windows and OSX windows in the same display
- Apple branched XFree86 4.3 and included it in the OS
- Pleasant experience for Unix apps on a Unix OS
- Support all traditional X11 technologies on a Mac (GLX, drag and drop, copy/paste, rootless)
- Cater to wide range of users: sysadmins, developers, research, design
- Take advantage of OSX features
- X11.app and XDarwin evolved in parallel
- XDarwin pulled most improvements from X11.app, X11.app pulled XFree86 updates
- X11.app brought to release quality and parked in maintenance mode
- Resync with XDarwin depended on volunteers
- XFree86 licensing change forced a move
Leopard X.org resync
- BSD team in Apple Core OS group took on task of porting X11.app to "new code"
- XFree86 code base severely bit-rotted
- Xplugin shim didn't keep up with the rest of the OS
- Non-existent developer community
- Limited Apple resources
- Functional, but some regressions, limited testing
- Installed by default on Leopard
Further open source push
- xquartz.macosforge.org - trac instance, etc
- Code maintained in freedesktop.org git branch, binaries on macosforge
- x11-users mailing list, xquartz-devel
- Picked up an intern! Hi Jeremy!
- Discarded old Darwin code, renamed DDX to Xquartz
- apple branches in git
Built like a normal X server, configure && make && make install
- Limping along on old Mesa code though
- Quirky input code
- Crashy rootless code
- Stale DRI code
- New DRI driver using OpenGL.framework
- Fix rootless and/or replace with Composite
- Better input and RANDR support
- Smarter clipboard proxying
- Move to open source window manager
- Sync with git master
Kristian Høgsberg on DRI2
Debriefing: A one-time, semi-structured conversation with an individual who has just experienced a stressful or traumatic event.
No kernel side
- uses only buffer objects and associated API
- drmAddMap, Create/Destroy Context/Drawable are no longer used. yay!
- intel driver has a driver-specific ioctl for communicating the hardware lock with the kernel
- possible new ioctl for mapping a set of buffer objects (rather than just one)
Simpler DDX driver requirements
- fbconfig come from the DRI driver instead of from the DDX
- back buffers allocated on demand by the DRI driver
- sarea is a buffer object rather than a magic map
- new self-describing block format for sarea records
- sarea still necessary for describing cliprect changes
(log showing dynamic buffer allocation)
Event ring buffer
- Used for communication from X server for window position and cliprects, and drawable to buffer-object mappings
- X server is the only writer. Two head pointers: end of valid data, and end of event currently being written
- client maintains per-drawable tail-pointer, checks for updates by comparing to server head pointer
- currently checked every time you take the lock
- protocol for catching up if the client falls too far behind the server
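A minimal sketch of the single-writer log described above, assuming fixed-size slots and ever-increasing counters in place of real shared memory. The class and method names are hypothetical, and a real implementation would also need memory barriers and a concrete record format.

```python
class FellBehind(Exception):
    """The writer lapped this reader, so some events were overwritten."""


class EventLog:
    """Ring of events with one writer (the X server in the talk)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0        # end of valid data
        self.write_head = 0  # end of the event currently being written

    def post(self, event):
        self.write_head = self.head + 1           # reserve the slot
        self.slots[self.head % self.size] = event  # fill it in
        self.head = self.write_head               # publish it


class DrawableReader:
    """Client side: a tail pointer compared against the server head."""

    def __init__(self, log):
        self.log = log
        self.tail = log.head

    def poll(self):
        log = self.log
        if self.tail == log.head:
            return []  # nothing new
        if log.write_head - self.tail > log.size:
            # Too far behind: skip to the head and let the caller resync.
            self.tail = log.head
            raise FellBehind()
        events = [log.slots[i % log.size] for i in range(self.tail, log.head)]
        self.tail = log.head
        return events
```

In this sketch a reader that catches `FellBehind` would re-request the full drawable state from the server, which stands in for the catch-up protocol mentioned in the last bullet.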
Initialization and operation examples
- Server setup: DDX driver opens DRM device, calls dri2 to initialize the sarea, with optional size for driver-specific blocks
- Client setup: If DRI2 extension is available, call DRI2Connect for busid, driver name, sarea buffer object handle; call DRI driver's createNewScreen, which doesn't set up any pre-sized render buffers.
- Tell DRI2 module that we're interested in a given X drawable. Server responds by logging it to the new sarea.
- When the drawable is bound to a context, the DRI driver parses the sarea event log and sets up the necessary render buffers.
- GLX_EXT_texture_from_pixmap in direct contexts for free
As the server moves windows around, the cliprects change and you have to blit. You have to make this appear atomic to clients, otherwise the pixels and cliprects will be inconsistent
- Introduce two new submit methods:
- submit and update
- compare and submit
Probably makes sense to move buffer swaps to the server
Technical discussion on how to implement sync to vblank.
(Super hot demo)
Eric Anholt on Render
What do app authors want?
YUV picture formats as a new source picture type
JPEG/PNG picture formats for compressed image transport
YUV needs new filter types for hue/contrast/etc
Gradients need to be properly specified, should just copy the cairo spec
Gradients need dithering, and for filters to work in general
Technical discussion of how to do automatic mipmapping
Rendercheck needs a ton of work
Transforms need to be floating point? Or at least need to be specifiable in floating point
(Søren rants for a while about the imprecision of the spec)
Jérôme Glisse on Radeon
Kernel modesetting is in progress, for all the usual reasons: suspend, flicker-free boot, reducing root code in userspace, graphical boot, graphical panic
Really hard to preserve compatibility, so a new driver sounded like a good idea. Problem is the existing code is well-tested and also tedious to rewrite. Rewriting it was a mistake; should have used the existing DDX code to begin with.
Why ATOM? Should work since Windows uses it. Reuses AMD engineering work. Keeps the driver from needing to know hw details. But, sometimes buggy and not always exactly what we want.
What do users want? Working 3D. Accelerated video. Modesetting is uninteresting as long as it comes up.
So, use ATOM to leverage the existing AMD engineering work. Reduce time to support new hardware, get on to the fun features.
AMD GPU design is "like Lego blocks". Different GPUs share the same modules (DAC/TMDSC, 3D engine, CP), but the modules can move around between GPUs, and they do. But ATOM takes care of most of this for us.
Kernel code should reflect this modular design; each piece should talk only to the piece of the chip that it's responsible for.
Lockups are easy to trigger. And, surprisingly hard to recover from. (Technical details.) Need to sanitize the command stream to be sure you don't do weird things, and avoid the WAIT_UNTIL feature. But this makes vsyncing hard.
Lots of cases where the GPU can stall. Fencing is really hard; need to be careful about batching them. Some buffer offset registers stall until they can be done safely, so you want to avoid doing them. Context switches need some optimization for this case.
Fragment programs are essential now, and optimisation is critical so you correctly run everything you claim to be able to run. LLVM will save us all.
Texture indirections are a finite resource. You can reduce them by rescheduling the shader.
r300 and r400 don't support all swizzling combinations. Want to redo shaders to use native swizzling when possible.
Future work: DRI2, radeon gallium driver, get gears to work, fragprog compiler.
David Schleef on video
Background: working on Dirac, which is a codec from BBC that's designed to be competitive with MPEG4.
Currently: 8 bit per channel, normal color gamut, 480i/30 or 720p/30.
Future: 10-12 bit per channel, wide gamut, 1080p/60, possibly stereoscopic.
- decode video
- yuv to rgb
- gamma to linear
- rgb to device color space
- linear to gamma
- calibration curve
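The middle four steps of the pipeline above can be sketched per pixel. Assumptions: BT.709 luma coefficients, normalized values (Y in 0..1, Cb/Cr in -0.5..0.5), the sRGB transfer function standing in for "gamma", and a 3x3 matrix for the device color space. Decode and the calibration curve are left out, and all function names are hypothetical.

```python
def yuv709_to_rgb(y, cb, cr):
    """Y'CbCr to gamma-encoded R'G'B' using BT.709 coefficients."""
    kr, kb = 0.2126, 0.0722
    kg = 1.0 - kr - kb
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return r, g, b


def srgb_to_linear(c):
    """Gamma to linear (sRGB curve, assumed for illustration)."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Linear to gamma (inverse of the curve above)."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055


def clamp01(c):
    return min(max(c, 0.0), 1.0)


def apply_matrix(m, v):
    return [sum(m[row][col] * v[col] for col in range(3)) for row in range(3)]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def present_pixel(y, cb, cr, rgb_to_device=IDENTITY):
    encoded = yuv709_to_rgb(y, cb, cr)                        # yuv to rgb
    linear = [srgb_to_linear(clamp01(c)) for c in encoded]    # gamma to linear
    device = apply_matrix(rgb_to_device, linear)              # rgb to device
    return [linear_to_srgb(clamp01(c)) for c in device]       # linear to gamma
```

The device-space multiply happens on linear values in this sketch, which is why the two gamma steps bracket it.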
Colorspaces we care about:
- BT-709 (sRGB) - the web, HD everything
- BT-601 - DVDs, SD TV, other old stuff
- Whatever your monitor uses
- CIE XYZ (Digital cinema, wide-gamut video)
(Explanation of a chromaticity diagram)
- Could go anywhere, somewhat depends on the colorspace
- use a bicubic filter or better
- Scaling is hard to do well, near-axis-aligned lines will alias badly
What do apps want to know?
- Exact frame rate of the output pipe
- Exact time of image presentation
- Sync to vblank
- Good sides: Zero copy with the SHM extension, simple, maps well to the original problem domain
- Bad sides: attributes underspecified, lots of useless attributes, missing modern details, slightly desynced with X rendering, doesn't work in cairo
- Good: works everywhere, tends not to use software fallbacks
- Bad: vertical softening, bad scaling, bad conversions, no rgb support, only one overlay
- Good: lots of control
- Bad: needs GLSL for that control, very complicated, no native YUV, three ways to do vsync, not meant for video, can't connect XVideo port to OpenGL
- Good: intel works, mostly; nvidia works, windows works.
- Bad: works inconsistently
- Pluggable pipeline
- Extend XVideo
- GPGPU codecs
Alex Deucher on R600 Architecture
R500 3D was built on R300. Very similar designs.
(Diagram of GPU hardware evolution)
- New surface setup
- Expanded CP, Prefetch processor (PFP)
- New paging and DMA engine
- New interrupt handler
- Virtual memory support
- Unified shaders
- Compose rect engine
Host data path
- 32 surfaces for tiling and swapping
- Can do virtual or physical addresses
- Privileged and non-privileged access per surface
- Same basic setup and semantics as previous generations
- New prefetch parser
- Dedicated DMA engine
- Lots of new Type 3 packets
- All drawing and state can go through the CP, no need to touch registers
- 2D packets emulated by the CP
- Insert interrupts for event completion
Paging and DMA engine
- Has own ring buffer, async from the GPU
- Host BLT, copy BLT, fences, interrupts
- Different packet formats from CP
- Dedicated ring buffer for incoming interrupts
- Source ID to map interrupt to originating block and subfunction
- Translation of logical addresses seen by clients into physical addresses seen by the memory controller
- Page fault detection (maybe not validated)
- Page access bits (also maybe not validated)
- One big program of control flow, ALU, or fetch clauses
- 128 bit pixels, 5 way superscalar, access to vector and scalar results from previous insn
- Five main types of shader: vertex, pixel, geometry, DMA copy, fetch
Compose rect (crazy 1-bit text accel thing)
- TCORE programming SDK soon
- Programming guide
Keith Packard on RANDR 1.3
- DPMS events
- Per-output DPMS
- Panning rectangle
- Projective transform
- GPU objects
- CRTC properties
- Standard output properties
- Goal is power management without polling
Basic plan: Enable/Disable requests, OutputDPMS request, OutputDPMSChanged and EnableChanged event, IDLETIME sync counter
- Allows DPMS transition to be decoupled from "idle"
- Provides DPMS notification to applications
- Leaves legacy auto-DPMS stuff around
- New enable and disable are refcounted. Event includes the refcount.
- Bounding box for the CRTC, just moves the CRTC origin around
- CRTC follows mouse on edge transitions
CRTC events will be generated, probably needs to be a PanningNotify for lack of room
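The panning behaviour above can be sketched as a pure function: the CRTC origin moves only when the pointer crosses an edge of the visible area, and it stays inside the bounding box. This is an illustration, not the RANDR 1.3 implementation; the function name and tuple layouts are assumptions, and the panning rectangle is assumed to be at least as large as the CRTC.

```python
def pan_origin(origin, crtc_size, pointer, panning_rect):
    """Return the new CRTC origin after the pointer moves.

    origin: (x, y) of the CRTC; crtc_size: (width, height);
    pointer: (x, y); panning_rect: (x, y, width, height) bounding box.
    """
    ox, oy = origin
    width, height = crtc_size
    px, py = pointer
    rx, ry, rw, rh = panning_rect

    def follow(o, size, p):
        # Shift just far enough to bring the pointer back on screen.
        if p < o:
            return p
        if p >= o + size:
            return p - size + 1
        return o

    def clamp(o, size, start, length):
        # Keep the whole CRTC inside the panning rectangle.
        return min(max(o, start), start + length - size)

    ox = clamp(follow(ox, width, px), width, rx, rw)
    oy = clamp(follow(oy, height, py), height, ry, rh)
    return ox, oy
```

For an 800x600 CRTC at (0, 0) inside a 1600x1200 box, a pointer at (900, 100) moves the origin to (101, 0), while a pointer that stays inside the visible area leaves it alone.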
- Handles input
- Uses existing rotation backend
- No driver impact
- Would be more efficient done in the compositing manager, but wouldn't handle input
- This actually works. Whoa.
- Completes the object hierarchy
- Object, list of properties, mapping to CRTC
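For the projective transform discussed earlier in this talk, the core operation is a 3x3 matrix applied to homogeneous coordinates followed by a divide. A minimal sketch with a hypothetical function name; the matrices in the usage note are illustrative values, not anything taken from the RANDR protocol.

```python
def transform_point(m, x, y):
    """Apply a 3x3 projective transform to the point (x, y)."""
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    # The divide by w is what makes the transform projective, not affine.
    return xh / w, yh / w


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
SCALE_2X = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
KEYSTONE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
```

With a non-trivial bottom row such as `KEYSTONE`, points are scaled by an amount that depends on their position, which is the keystone-correction case. Handling input means pointer coordinates go through the matching mapping, so a sketch like this would be applied in both directions.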
Standard output properties
- Provide more information about outputs
- Connector type, signalling level, etc.
- Need a list and a definition of each
Chris Ball on tinderbox
Started working on tinderbox as part of the OLPC bringup effort
Got interested in adding that test harness to X
It works! tinderbox.x.org
Gives you a long list of build results, properly hierarchical, triggered by commits.
Also has some basic test infrastructure for sanity-checking the builds:
- cairo tests
Not really clear how to measure failures. Some things never worked. Doesn't have a success metric for the tests right now besides "ran to completion"; should change to catching changes in numbers of passes/fails.
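The proposed metric (changes in passes/fails rather than "ran to completion") can be sketched as a diff between two runs. The result format, a mapping from test name to "pass" or "fail", is an assumption, as are the test names in the example.

```python
def compare_runs(previous, current):
    """Report tests whose result changed between two runs.

    Both arguments map test name -> "pass" or "fail". Tests that have
    never worked stay out of the report as long as they keep failing.
    """
    regressions = sorted(
        name for name, result in current.items()
        if result == "fail" and previous.get(name) == "pass"
    )
    fixes = sorted(
        name for name, result in current.items()
        if result == "pass" and previous.get(name) == "fail"
    )
    return regressions, fixes


# Hypothetical results from two consecutive builds.
before = {"clip-fill": "pass", "gradient": "fail", "text": "pass"}
after = {"clip-fill": "fail", "gradient": "fail", "text": "pass"}
regressions, fixes = compare_runs(before, after)
```

A build would then be flagged only when `regressions` is non-empty, which sidesteps the problem of tests that have never passed.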
What else do we need?
- mozilla trender
- cairo tools could be rewritten to get a more parseable output
- possibility of making throwaway "tinderbox me please" branches for pre-commit testing
- possibility of keeping build output for the last N things so you can apply new test against older builds
- more machines
Bart Massey on image support in XCB
XCB is a replacement for Xlib: a thin C binding with latency hiding and transparent multithreading
Just does a protocol binding and some language bindings
... and then also has a util library
In attempting to port x11perf to XCB, noticed that the image implementation is broken
X image protocol is subtle and quick to anger. Not many apps use the Xlib facilities in interesting ways.
The gory details
- Decide XY or Z format, size, depth
- Walk the connection data looking for valid formats
- Compute unit, pad, byte and bit swap, size; deal with planes
- Maybe convert or expand
- Create new image with given or native format
- Convert images between formats
- fast get/put pixel inlines
- User-controllable memory management
- Protocol insulation
xcb_bitops.h provides some fast bit-banging inlines
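The "compute unit, pad, byte and bit swap, size" step in the list above is mostly arithmetic. A sketch of two pieces of it, assuming the usual X rule that each scanline is rounded up to a multiple of the scanline pad in bits; these helper names are hypothetical and are not the xcb-image API.

```python
def bytes_per_line(width, bits_per_pixel, scanline_pad):
    """Bytes in one scanline, rounded up to scanline_pad bits (8, 16 or 32)."""
    bits = width * bits_per_pixel
    padded_bits = (bits + scanline_pad - 1) // scanline_pad * scanline_pad
    return padded_bits // 8


def image_size(width, height, bits_per_pixel, scanline_pad):
    """Total bytes for a single-plane (Z format) image."""
    return bytes_per_line(width, bits_per_pixel, scanline_pad) * height


def swap_bytes32(value):
    """Reverse the byte order of a 32-bit unit."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")
```

For example, three 24-bit pixels need 72 bits, which a 32-bit pad rounds up to 96 bits, so the stride is 12 bytes rather than 9.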
Forced some reevaluation of some XCB design
- error handling is weird
- latency hiding in libraries
- build mechanisms
- util packaging
Requests for help and further work
- Finish the x11perf port
- Image test suite
- Specify or build other userland libs
- Port apps to XCB
Stefan Dösinger on Wine and X
Wine is a set of libraries to allow running unmodified Windows apps.
Uses X for input and output, translates GDI and DirectX to X APIs
Raw mouse input
- Windows can report relative mouse movement
- Mouse and keyboard notifications are windows independent
- Client-side workaround, but fails when you hit the screen border, or when the app changes focus
- Should be doable with XInput
Tablet support doesn't really exist yet, but that's being worked on in X already
3D rendering issues
- Flat shading colors; different behaviour in GL versus D3D. extension proposal, should be easy.
- pbuffers need to be supported in the driver, pain to implement them with FBOs. DRI2 should solve this.
- multithreading. Our drivers break when multiple contexts run on one drawable.
- sRGB rendering targets. GLX extension is stronger than D3D9 extension, so only works on DX10 hardware. Could get away with ability for write correction without blending.
- GLSL. Conversion from D3D done with generated GLSL, depends on the driver to optimize.
X.org Foundation Board on the X.org Foundation Board
Members! Be members. Go to members.x.org
We exist to provide education, support, and infrastructure to enable X development. If you think you could use some of that, let us know! We will do what we can.
Meetings and conferences are not necessarily limited to XDC and XDS. If you want to do something smaller and local, let us know! We will do what we can.
The software has been coming out. See elsewhere for details.
Vendor relationships keep getting better. Docs and source keep coming out. Life is good.
We are doing GSoC again. We may be doing additional direct student funding again.
We are VESA members, we're about to be CEA members. We should do something about Khronos; if this is you, let us know! Getting this information out to the development community is important, so let us know if this will help you.
The old LLC structure is being closed out, finally. We're within epsilon of being a 501(c)(3).
Apologies for the election being so chaotic last time around. There is a team working on improving this process.
Current board meeting structure is bi-weekly for the whole board plus committee meetings as needed.
Membership agreement is being finalised to make membership easier to agree to and sign up for.
We're spending money well now. We will begin looking for sponsorships again. Healthy cash flow is healthy.
Infrastructure between x.org and freedesktop.org is being merged behind the scenes, for the betterment of both.
There is another major conference coming. September 10-12 (ish) is XDS 2008 at Edinburgh Zoo. Conference facilities, probably a sunset tour, general good times. Should have final details in a few weeks.
Eamon Walsh with A Talk
- reworked now.
- They're lazy allocated, way simpler to program, but have some small issues.
- Merge worked, and the changes are documented. Yay!
- xf86AllocateEntityPrivateIndex(), probably won't be touched
- AllocateFontPrivate(), also. Not really worth fixing.
- Some minor optimizations to be had with CSE.
Things I hate
- State polling requests suck. "Give me the state of the keyboard! Including button up/down!" Doom.
- How do we fix? Lie? Rate limit? Visual feedback?
- Add a policy framework, start fixing the toolkits
- This is how to steal window contents
- But just censoring to black flickers
- Maybe only block GetImage from bg=None windows? Or just fix gtk? Hmm.
Please can we kill more extensions
- Every protocol request is an entrypoint.
- But each one has to be characterised, and it's the ioctl problem.
- (Hateful spreadsheet)
- MIT-SUNDRY-NONSTANDARD. Die.
- APPGROUP. Unused. Die.
- XFree86-DGA. Dead once we get relative mouse motion.
- XFree86-Misc. Die.
- TOG-CUP. Die.
- Xinerama. Enh. Kinda hard.
- DPMS. Can pretty much just go.
- XFree86-VidModeExtension. Can die once we fix RANDR.
- Finish some XCB stuff.
- GLX and DRI.
- MPX/Input security improvements.
- Move on to higher desktop layers.
(Technical discussion of how to fix bg=None)
Owen Taylor is working on fixing Radeon Render performance. And making progress! Should be done as generic Render services, so everyone wins.
John Bridgman raises some concern about lockstepping DRM and X as far as getting support pushed out. There's some slip, it works. But it is something the driver maintainers need to be conscious of. Also, what are the last steps to getting X to run as non-root?
Paulo Zanonni and Tiago Vignatti on VGA arbitration. This needs to be done in the kernel to get it right, since each server may want to have VGA space routed to it at various times. Existing proof-of-concept as kernel+library+server code. Should just be a /dev node.
Jesse Barnes on getting off of /dev/mem. Mostly there. How do we expose caching policies to maps? How do we deal with non-PCI video devices like USB?
Adam Jackson on Xinerama and shatter and stuff. If you're interested, go talk to him. http://cgit.freedesktop.org/~ajax/xserver-shatter
Daniel Stone on input. Core, XI, and XKB. Massive amounts of duplication. This is really terrible. Lots of it is being fixed, yay! Being worked on in xkb-atkins branch. This is sort of a prerequisite for merging MPX.
Bart says cut and paste is still broken.