I just realized that the report and presentation I wrote about Coloquinte were never made public. Here they are, under CC-BY-SA. Please use it as you see fit: to learn about placement algorithms, as a bibliography, or maybe to improve Coloquinte or other tools!
I spent part of the weekend trying to improve the legalization algorithm in Coloquinte – now integrated in Coriolis. It turns out that I didn’t manage to obtain any improvement: every change I tried obtained worse results than the current version.
The current algorithm doesn’t even legalize the true placement: it works on a modified placement obtained by heuristically solving a flow problem. Although this placement is supposed to be close to the best legal placement, I didn’t expect it to stand against more sophisticated heuristics that use it only as a hint.
I think that what makes it so good is that this first pass is inherently imprecise and optimistic: within the finer grained window, it doesn’t move the cells as much as it should. They are moved very close to their final position, but still a bit off in the direction of their target position. Therefore, the second pass receives a good hint placement which is still a bit biased toward the target.
Whatever the reason is, I’m finished with the legalizer. I don’t think there is much more work to do in this area, and I’m convinced that it is better than published works like Abacus and HiBinLegalizer. Sadly, benchmarking against them involves finding and installing them and MPL, which is all but trivial. Our legalizer should be good enough for us now (at least for standard cells, big macros are still a pain).
The ISPD conference organizes a design automation contest each year. It provides a lot of public benchmarks and results to compare to. Although Coloquinte does not yield the best results yet, it compares favorably against most published results. Here are some screenshots of wirelength-driven placements on some ISPD05 benchmarks.
Variable densities are placements are now possible in Coloquinte. I am quite happy with the way it is done, by changing the placement region density rather than the cells’ areas like in other tools.
I implemented it using a line sweep algorithm, which makes it flexible: it can be fed overlapping regions and handles both macros and limited density regions in the same way. It remains to be seen if I can make a satisfying routing-driven placer.
Coloquinte is now used by the Coriolis toolchain. This means that you can now route the circuits obtained – and with an open-source tool.
Here are some screenshots of a circuit obtained with Coloquinte and Kite (the router). The next big step is to make congestion-aware placements: although Coloquinte and Kite optimize well enough that there is no routing failure on small circuits, this feature is mandatory in an industrial tool.
I find it very motivating to finally see what the algorithms do. Running the algorithm on a true circuit rather than ISPD benchmarks has caught some silly bugs, but now I’m adding features again.
It has been three months since I began this project, and it is time for a checkpoint. I just implemented the last mandatory algorithm, and Coloquinte’s algorithms are now usable to build a circuit.
It does not support standard formats, and since I still didn’t program the backend the results are still ~15% from the best state of the art. But all the core algorithms have been implemented.
The core wirelength optimizer has been the first part I implemented. I chosed to write the sparse linear algebra code from scratch rather than using a library. The whole thing is in fact closer to a true non-linear optimizer, since it never completely solves the linear system, but overall it is very similar to other placement tools.
This tool modifies a solution so that it meets cell density constraints: it drives the optimizer toward an almost “legal” solution, where there are no more cells in a region than allowed by its area. It is although used as a pre-processing step of a true legalization.
It is very simple but extremely different from other algorithms. It distributes the cells in the regions and improves the distribution between nearby regions. There are powerful algorithms to do it, but the simplest heuristics seem to work best.
The legalizer is the final step to obtain a correct placement. It places the cells so that they do not overlap, hopefully as close to the optimized solution as possible.
The naive way to do it is to sort the cells by position and chose the best position for each cell sequentially. Coloquinte’s legalizer uses a single row problem, which allows it to push already legalized cells to obtain a better solution (in Coloquinte it optimizes the displacement, but other legalizers did it for the quadratic displacement). Compared with the other solution, it is extremely stable.
Obviously, a good global placement doesn’t replace a detailed placement optimization: it is the next big step. There are still a lot of things to be done, and a lot of naive heuristics where we could do better, but detailed placement should finally close the gap with state of the art tools.
As I said in the previous post, Coloquinte will be built around an analytical placement algorithm alternating between convex placement optimization and cell spreading: it is one of the most efficient methods published so far.
My early prototypes worked, but the spreading algorithm was a bit too simple to cope with the benchmarks. More importantly, they lacked a clean architecture.
The new goal is to work well on a large class of circuits and the tool should be more modular. It is meant to interface loosely with Paris 6 university’s tools: the Coriolis toolchain from the Lip6 laboratory, where I work part-time from next week, and its database Hurricane.
It should be splitted between 3 independent blocks that transform the Hurricane representation: the global placer, the legalizer and the detailed placer. In order to keep control over the optimization passes applied, each tool is scriptable in Python. This is particularly important for the global placer, where everything is tunable from the spreading schedule to the net models.
The global placer
The global placer uses a technique known as quadratic placement where the wires are modelled as elastic springs: this yields a sparse linear system, which can be solved approximately in linearithmic time. It is not as artificial as it seems: it is in fact very close to Newton’s method for optimization.
This quadratic placement alternates with a spreading phase, whose result is fed back to “anchor” the next placement iteration. Everything is tunable: I intend to let the user chose the net model, the legalization algorithm and how strong the cells are pulled toward the legal position. Each of these components gets a Python interface so that the whole optimization is scriptable.
You may note that I include “Companion tools” that could interface with the global placer: such tools are necessary when optimizing for complex objectives. I build Coloquinte so that it can cope with other tools with minor modifications in case we want to extend it.
The detailed placer and legalizer need not as much controllability, but I am going to use the same architecture nonetheless: the more the designer can control, the better the result. Ideally, we would try multiple combinations of the optimization passes to obtain the best tradeoffs: they can be distributed as several Python scripts, possibly completely different.
Today, I begin to program the legalizer. Since this algorithm is completely new, I am probably not going to talk about it anytime soon, but I have a (partially) new one for detailed placement that I am going to present in the next post.
The choice between analytical placement and partitioning is a crucial and early one. For Coloquinte, however, I made this choice a long time ago.
There are two obvious reasons: analytical placement generally performs a little better and I know it better.
However, there are deeper reasons why I want to commit to using analytical placement. In my opinion, analytical placement yields better flexibility and modularity.
First, it is tunable with simple but numerous parameters. By changing the cost function, you can in theory optimize for timing and power. The spreading forces control the tradeoff between speed of convergence and solution quality.
With partitioning-based placement, you need to design whole new algorithms for power, timing and congestion optimization (which many papers did). With a complete analytical placer, you can test parameter tuning within minutes: change the cost function and the spreading schedule, then execute. And there are a lot of cost functions to chose from.
Another important property of some analytical placement algorithms is that they modify an existing placement at each step, not build it from scratch. For this reason, it is convenient to perform netlist or placement changes. Whether to interface it with other programs (performing cell resizing and buffering for example) or to enable engineering change orders (ECOs), it is a useful feature. I suppose it makes it easier to adapt the placer to new problems, with all the performance-critical code factored between the solver and the legalizer.
Therefore, I am going to use an analytical placement that continuously spreads the cells: if region constraints are used, circuit modifications are not as easy. Additionally, this method is the leading one on benchmarks: using it should improve both flexibility and efficiency.
That doesn’t mean writing a complete placement tool is easy: external tools are needed to analyze timing and congestion, but with analytical placement the interface may be very loose and quite modular. I am going to show Coloquinte’s interface in the next post.