Optimizing bulk update performance in PostgreSQL with dependencies

Basically my question is the same as this one, but WITH dependencies, so dropping or renaming the table is not a trivial option (I assume).

We are refactoring a large, poorly designed table which has many columns and references to it. It currently has a text field that should be a foreign key. The naive update looks like:

UPDATE myTable SET new_id=(SELECT id FROM list WHERE name=old_text);

The above takes practically forever because the table is large and is, in effect, temporarily doubled in size, since an UPDATE is equivalent to an INSERT plus a DELETE.

We do not need everything done in one transaction. So we are considering some sort of external script to do the updates in batches of 5000 or so, but tests indicate it will still be painful/slow.

Suggestions on how to improve performance?
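
A minimal sketch of the batched approach described above, not a tuned recipe for PostgreSQL. It assumes myTable has an integer primary key named id (the question does not say so) and uses Python's built-in sqlite3 module as a stand-in connection so the example is self-contained. Against PostgreSQL the same loop would go through a driver such as psycopg2, which uses a different placeholder syntax.

```python
import sqlite3


def batched_update(conn, batch_size=5000):
    """Fill new_id in primary-key ranges, committing after each batch.

    Assumes an integer primary key `id` on myTable (hypothetical).
    Returns the number of rows touched.
    """
    cur = conn.cursor()
    cur.execute("SELECT MIN(id), MAX(id) FROM myTable")
    lo, hi = cur.fetchone()
    if lo is None:  # empty table, nothing to do
        return 0

    touched = 0
    start = lo
    while start <= hi:
        end = start + batch_size - 1
        # `new_id IS NULL` lets an interrupted run be restarted safely.
        cur.execute(
            """
            UPDATE myTable
               SET new_id = (SELECT l.id FROM list AS l
                              WHERE l.name = myTable.old_text)
             WHERE id BETWEEN ? AND ?
               AND new_id IS NULL
            """,
            (start, end),
        )
        touched += cur.rowcount
        conn.commit()  # one short transaction per batch
        start = end + 1
    return touched
```

Walking the key range keeps each transaction small, and an index on list.name would matter much more than the batch size for the lookup itself.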

Why do MySQL server packages have Perl dependencies in Linux distros?

I’m trying to clean out some unneeded packages from one of my Gentoo boxes with emerge --depclean, and I thought I had a few Perl modules installed that none of my wanted packages should require.

So, I was a bit surprised to see that:

dev-db/mysql-5.5.39 requires >=dev-perl/DBD-mysql-2.9004

Shouldn’t it be the other way around? Why on earth is MySQL dependent on a Perl package?

The official MySQL documentation only says that Perl is required if running the test scripts when/after compiling from source.

I use the IUS releases of the LAMP (where P means PHP) stack on my CentOS boxes, and the mysql55-server-5.5.39-1.ius.el6.x86_64 package has for instance these requirements (obtained with rpm -qR):


Is there really a need for these requirements on the server packages?

Cross-compiling Slackware: is the build order listed anywhere?

I’m building a Slackware system from source and hitting a dependency wall here. (Before you ask: no, I’m not trying to “make it faster”; I’m building against a different C library.) Getting a toolchain and the very basics (coreutils, archivers, shell, perl, kernel, etc.) was simple enough, but when I look at the remaining several hundred packages I don’t know what order they need to be built in to meet their dependencies.

Looking through the various docs I don’t see any build order listed, and there’s no master build script either, just the individual packages’ SlackBuilds. And .tgz packages don’t list dependencies like debs or RPMs do. Is this just something Patrick keeps in his head that the rest of us mortals will have to figure out manually, or am I missing a doc somewhere?

I tried using BLFS but Slackware seems to build X much earlier in the process than BLFS does. I suppose I can simply try to build everything, note when dependencies fail, and manually make a dependency tree, but I’m hoping there’s just a build list somewhere I’m missing…
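
The manual fallback mentioned above (record which dependencies fail, then derive an order) amounts to a topological sort. A minimal sketch using Python's standard graphlib module follows. The package names and the dependency map are hypothetical placeholders, not Slackware's actual dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical, hand-collected map: package -> packages that must be built first.
deps = {
    "libc": set(),
    "zlib": {"libc"},
    "libpng": {"zlib"},
    "xorg-server": {"libpng", "zlib"},
}


def build_order(dependencies):
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(deps)
print(order)
```

Circular dependencies, which do occur between real packages, surface as a CycleError and have to be broken by hand, for example with a two-pass build.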

Are there two types of associations between objects or are there just different representations?

I’ve been spending some time on ‘re-tuning’ some of my OOP understanding, and I’ve come up against a concept that is confusing me.

Let’s say I have two objects. A user object and an account object. Back to basics here, but each object has state, behaviour and identity (often referred to as an entity object).

The user object manages behaviour purely associated with a user, for example we could have a login(credentials) method that returns if the login succeeds or throws an exception if not.

The account object manages behaviour purely associated with a user’s account. For example we could have a method checkActive() that checks if the account is active. The account object checks if the account has an up-to-date subscription, and checks if there are any admin flags added which would make it inactive. It returns if the checks pass, or throws an exception if not.

Now here lies my problem. There is clearly a relationship between user and account, but I feel that there are actually two TYPES of association to consider. One that is data driven (exists only in the data/state of the objects and the database) and one that is behaviour driven (represents an object call to methods of the associated object).

Data Driven Association

In the example I have presented, there is clearly a data association between user and account. In a database schema we could have the following table:


When we instantiate the account and load the database data into it, there will be a class variable containing user_id. In essence, the account object holds an integer representation of the user through user_id.

Behaviour Driven Association

Behaviour driven associations are really the dependencies of an object. If object A calls methods on object B there is an association going from A to B. A holds an object representation of B.

In my example case, neither the user object nor the account object depend on each other to perform their tasks i.e. neither object calls methods on the other object. There is therefore no behaviour driven association between the two and neither object holds an object reference to the other.
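
A minimal sketch of the two forms contrasted above, written in Python only for illustration. The class and field names (AccountByKey, AccountByReference, owner) are hypothetical and not from any particular framework.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: int
    name: str


@dataclass
class AccountByKey:
    """Data-driven association: the account only stores the user's key."""
    account_id: int
    user_id: int  # an integer standing in for the user


@dataclass
class AccountByReference:
    """Behaviour-driven association: the account holds the user object."""
    account_id: int
    owner: User

    def owner_name(self) -> str:
        # Calling into the associated object is what creates the dependency.
        return self.owner.name


alice = User(user_id=7, name="Alice")
by_key = AccountByKey(account_id=1, user_id=alice.user_id)
by_ref = AccountByReference(account_id=1, owner=alice)
```

Both accounts refer to the same user; only the second can call methods on it without a lookup.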


Is the case I presented purely a case of entity representation? The association between user and account is always present, but it’s being represented in different ways?

I.e. the user entity has an identity that can be represented in different forms. It can be represented as an object (the instantiated user object) or as a unique integer from the users table in the database.

Is this a formalised way of recognising different implementations of associations or have I completely lost my mind?

One thing that bugs me is how would I describe the differences in UML or similar? Or is it just an implementation detail?

How to include in the result of a `SELECT … GROUP BY …` all the other columns that are functionally dependent on the grouping ones?

I’ll base this question on a toy example.

Let this be table A:

 U | V | W | X | Y |  Z
 a | b | c | 1 | 6 | 8.3
 a | b | c | 1 | 4 | 3.7
 a | b | f | 3 | 4 | 2.6
 a | b | f | 3 | 2 | 6.0
 a | e | c | 1 | 0 | 3.5
 a | e | c | 1 | 5 | 8.8
 d | b | f | 1 | 0 | 3.5
 d | b | f | 1 | 3 | 2.3
 d | e | c | 2 | 6 | 2.2
 d | e | c | 2 | 4 | 3.3
 d | e | f | 0 | 7 | 5.0
 d | e | f | 0 | 6 | 3.6

I can produce a second table B by grouping the rows of A by columns U, V, and W, and computing the average of column Z for each group.

 U | V | W | Z_avg
 a | b | c |  6.0
 a | b | f |  4.3
 a | e | c |  6.2
 d | b | f |  2.9
 d | e | c |  2.7
 d | e | f |  4.3

The SQL for this would be something like
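
A minimal sketch of such a query follows. The question names no database engine, so running it through Python's built-in sqlite3 module is an assumption made only to keep the example self-contained.

```python
import sqlite3

# The rows of table A from the example above.
ROWS = [
    ("a", "b", "c", 1, 6, 8.3), ("a", "b", "c", 1, 4, 3.7),
    ("a", "b", "f", 3, 4, 2.6), ("a", "b", "f", 3, 2, 6.0),
    ("a", "e", "c", 1, 0, 3.5), ("a", "e", "c", 1, 5, 8.8),
    ("d", "b", "f", 1, 0, 3.5), ("d", "b", "f", 1, 3, 2.3),
    ("d", "e", "c", 2, 6, 2.2), ("d", "e", "c", 2, 4, 3.3),
    ("d", "e", "f", 0, 7, 5.0), ("d", "e", "f", 0, 6, 3.6),
]

conn_b = sqlite3.connect(":memory:")
conn_b.execute("CREATE TABLE A (U TEXT, V TEXT, W TEXT, X INTEGER, Y INTEGER, Z REAL)")
conn_b.executemany("INSERT INTO A VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Group by U, V, W and average Z within each group.
table_b = conn_b.execute(
    """
    SELECT U, V, W, AVG(Z) AS Z_avg
      FROM A
     GROUP BY U, V, W
     ORDER BY U, V, W
    """
).fetchall()

for row in table_b:
    print(row)
```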


But I want the new table to include all the columns of the original table that have a functional dependence on the grouping columns U, V, and W. In this example there is one such column, namely column X.

In other words, I want to generate the table C shown below:

 U | V | W | X | Z_avg
 a | b | c | 1 |  6.0
 a | b | f | 3 |  4.3
 a | e | c | 1 |  6.2
 d | b | f | 1 |  2.9
 d | e | c | 2 |  2.7
 d | e | f | 0 |  4.3

So this problem has two parts, at least conceptually.

  1. How to determine which columns are functionally dependent on
    columns U, V, and W?

  2. What is the SQL to generate table C?

I know how to implement a (say, Python) script that can answer (1), but it is tedious and slow. (Basically, for each of the candidate columns, in this case X and Y, the script would collect all of its values for each distinct combination of values in columns U, V, and W, and then, if each of these sets of values has exactly one element, then the column is functionally dependent on U, V, and W.)

Likewise, once I have identified the functionally dependent columns, I can muddle my way through (using temporary tables and whatnot) to eventually end up with something like table C above (thus, effectively solving (2)).
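
Both parts can also be pushed into SQL, sketched below under some assumptions: SQLite via Python's sqlite3 module stands in for the unnamed engine, and the helper names dependent_columns and table_c are hypothetical. A column counts as dependent when no (U, V, W) group holds more than one distinct value of it. Taking MIN of such a column inside the GROUP BY then simply returns that single value.

```python
import sqlite3

ROWS = [
    ("a", "b", "c", 1, 6, 8.3), ("a", "b", "c", 1, 4, 3.7),
    ("a", "b", "f", 3, 4, 2.6), ("a", "b", "f", 3, 2, 6.0),
    ("a", "e", "c", 1, 0, 3.5), ("a", "e", "c", 1, 5, 8.8),
    ("d", "b", "f", 1, 0, 3.5), ("d", "b", "f", 1, 3, 2.3),
    ("d", "e", "c", 2, 6, 2.2), ("d", "e", "c", 2, 4, 3.3),
    ("d", "e", "f", 0, 7, 5.0), ("d", "e", "f", 0, 6, 3.6),
]
GROUP_COLS = ["U", "V", "W"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (U TEXT, V TEXT, W TEXT, X INTEGER, Y INTEGER, Z REAL)")
conn.executemany("INSERT INTO A VALUES (?, ?, ?, ?, ?, ?)", ROWS)


def dependent_columns(conn, candidates, group_cols=GROUP_COLS):
    """Part 1: keep the candidates with at most one distinct value per group.

    Column names are interpolated into the SQL text, so they must come from
    trusted code. COUNT(DISTINCT ...) ignores NULLs, which this sketch does
    not treat specially.
    """
    keys = ", ".join(group_cols)
    found = []
    for col in candidates:
        (worst,) = conn.execute(
            f"SELECT MAX(n) FROM "
            f"(SELECT COUNT(DISTINCT {col}) AS n FROM A GROUP BY {keys}) AS g"
        ).fetchone()
        if worst is not None and worst <= 1:
            found.append(col)
    return found


def table_c(conn, dependents, group_cols=GROUP_COLS):
    """Part 2: group as before, carrying each dependent column via MIN()."""
    keys = ", ".join(group_cols)
    extra = "".join(f", MIN({c}) AS {c}" for c in dependents)
    sql = (
        f"SELECT {keys}{extra}, AVG(Z) AS Z_avg "
        f"FROM A GROUP BY {keys} ORDER BY {keys}"
    )
    return conn.execute(sql).fetchall()


dependents = dependent_columns(conn, ["X", "Y"])
for row in table_c(conn, dependents):
    print(row)
```

An equivalent alternative for part 2 is to add the dependent columns to the GROUP BY list itself, which by definition cannot split any group.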

I figure, however, that this task is sufficiently common that there may be standard tools/techniques to carry it out.

cannot install locales in Debian 7 due to unmet dependencies

So instead of including wheezy-backports in my sources.list I had the bright idea of adding jessie directly. Realising my mistake, I cut out of an apt-get update/upgrade midway through, then reset sources.list to wheezy and ran through this list of commands. But when I try to sudo apt-get install locales this happens:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 base-files : PreDepends: awk
 erlang-crypto : Depends: libssl1.0.0 (>= 1.0.0) but it is not going to be installed
 libc6 : Depends: libgcc1 but it is not going to be installed
         Recommends: libc6-i686 but it is not going to be installed
         Breaks: locales (< 2.19)
 libncurses5 : PreDepends: multiarch-support but it is not going to be installed
               Recommends: libgpm2 but it is not going to be installed
 libtinfo5 : PreDepends: multiarch-support but it is not going to be installed
 locales : Depends: glibc-2.13-1
           Depends: debconf (>= 0.5) but it is not going to be installed or
 procps : Depends: libncursesw5 (>= 5.6+20070908) but it is not going to be installed
          Depends: libprocps0 (>= 1:3.3.2-1) but it is not going to be installed
          Depends: initscripts but it is not going to be installed
          Recommends: psmisc but it is not going to be installed
 zlib1g : PreDepends: multiarch-support but it is not going to be installed
E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.

How do I solve these dependencies?

In event sourcing, is it ok to introduce a dependency in my message class?

Following Martin Fowler’s explanation on event sourcing, I have a message class that looks something like this:

    Process(Ship ship) {}

However, in my case, I need to talk to another component in the Process method. More specifically, I need to access a repository to get some basic data. Is it ok to add this repository to the method, i.e.:

    Process(Ship ship, IBasicDataRepository repo) {}

I could put it in the constructor of my message too, of course. However, I can’t pass in the basic data, because the repo will be called multiple times with different parameters, depending on what’s in the `ship` object.

So is it ok to introduce external dependencies in the event/message class or is there a better way?
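
A minimal sketch of the method-parameter variant, translated to Python for illustration. Every name here (CargoLoadedEvent, cargo_codes, manifest, describe, InMemoryRepository) is hypothetical. Only the shape, Process(ship, repo), comes from the question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


class BasicDataRepository(Protocol):
    """Hypothetical stand-in for IBasicDataRepository."""

    def describe(self, code: str) -> str: ...


@dataclass
class Ship:
    name: str
    cargo_codes: List[str] = field(default_factory=list)
    manifest: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class CargoLoadedEvent:
    """The message carries only its own data; the repo arrives per call."""

    code: str

    def process(self, ship: Ship, repo: BasicDataRepository) -> None:
        ship.cargo_codes.append(self.code)
        # The lookups depend on what is in the ship, so they happen here.
        ship.manifest = [repo.describe(c) for c in ship.cargo_codes]


class InMemoryRepository:
    """Test double that satisfies the protocol structurally."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = data

    def describe(self, code: str) -> str:
        return self._data[code]
```

One thing to weigh with this shape: if the repository's data changes between runs, replaying the same event may produce a different result.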

How to find the packages that depend on a certain package in apt?

How can I get, not the dependencies of a package, but the packages that are depending on a certain package?

I’m on Debian 6.0 Squeeze-LTS (the first-time extension to squeeze for long term support!) for my web server, and it reports that support has ended for a certain package:

Unfortunately, it has been necessary to limit security support for some

The following packages found on this system are affected by this:

* Source:libplrpc-perl, ended on 2014-05-31 at version 0.2020-2
  Details: Not supported in squeeze LTS
 Affected binary package:
 - libplrpc-perl (installed version: 0.2020-2)

I don’t really want to try to uninstall that binary package without seeing what depends on it, and its description mentions stuff that I’ve never heard of before:

libplrpc-perl: Perl extensions for writing PlRPC servers and clients

So I’d be fine with just removing the package if possible, but want to determine the things that depend on it before doing so.

How to handle passing multiple dependencies in a module hierarchy

So I have my application consisting of a number of modules in a module hierarchy. Furthermore let’s also assume each module is a class and we have a tree of classes where the classes at the top are using the classes below, to make it more simple.

A class A1 at the very bottom may depend on some input parameters. Class B1 is above class A1 and creates and uses instances of class A1. Therefore it has to pass the dependencies needed by the instances of class A1 into them. If it can’t create these dependencies from some operations, class B1 now has its own dependencies plus the dependencies of class A1.
The higher we go, the more these dependencies add up, so that the top-level class will need to know all the dependencies unless they can be created at a lower level.

This means if class A1 in my program depends on the current temperature, I have to pass this to the top-level class, which then passes it to the next class and so on until it arrives at the very bottom in class A1. If I do that, I make the state explicit, but it also means that I have methods or classes that take many parameters.


What if the temperature may change while the application is running – is there a way to avoid passing it all the way down the class tree without giving up explicit state? How would you guys handle this?


What if the temperature is a constant that will never change while the program is running? Does this give us more options to avoid always passing it as an argument? I could see someone using a global configuration (singleton), but it will make it harder to test, right?

I could also pass not the temperature itself but a configuration object. This would mean the class above does not receive, for example, a temperature and an air pressure parameter, but gets a configuration object and passes it to class B1 and class B2, where class B1 only needs the air pressure and class B2 only needs the temperature. Is that a good approach? What are the pros/cons?
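
A minimal sketch of the configuration-object idea in Python. The parent class C1 and the Config fields are hypothetical names chosen for the example. B1 and B2 play the roles described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Immutable bundle of the values the lower classes need."""

    temperature: float
    air_pressure: float


class B1:
    def __init__(self, config: Config) -> None:
        # B1 only reads the air pressure.
        self._air_pressure = config.air_pressure

    def report(self) -> str:
        return f"pressure={self._air_pressure}"


class B2:
    def __init__(self, config: Config) -> None:
        # B2 only reads the temperature.
        self._temperature = config.temperature

    def report(self) -> str:
        return f"temperature={self._temperature}"


class C1:
    """Hypothetical parent: takes one argument instead of two."""

    def __init__(self, config: Config) -> None:
        self.b1 = B1(config)
        self.b2 = B2(config)


top = C1(Config(temperature=21.5, air_pressure=1013.0))
print(top.b1.report(), top.b2.report())
```

The signatures stay short and tests can build a Config directly, but each class can now see values it does not need, so its real dependencies are less obvious from its constructor.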

Converted ERD into Dependency Diagram 1NF, 3NF

[Image: ERD]

That’s the ERD.

I’m studying for an upcoming test and having trouble getting my head around converting these ERDs into dependency diagrams. This one is from a previous exam that the lecturer told us to study, and he said the test will be pretty similar to it.

We have to:

- Convert the ERD into a dependency diagram.
- Then convert that diagram into a 3NF model, with no transitive or partial dependencies.

Any help would be much appreciated as I’m struggling to get my head around it and need to pass this test :/

Thanks in advance!
