UVA1006 Fixed Partition Memory Management（二分图最优匹配/最小权值匹配）

Fixed Partition Memory Management

A technique used in early multiprogramming operating systems involved partitioning the available primary memory into a number of regions with each region having a fixed size, different regions potentially having different sizes. The sum of the sizes of all regions equals the size of the primary memory.

Given a set of programs, it was the task of the operating system to assign the programs to different memory regions, so that they could be executed concurrently. This was made difficult due to the fact that the execution time of a program might depend on the amount of memory available to it. Every program has a minimum space requirement, but if it is assigned to a larger memory region its execution time might increase or decrease.

In this program, you have to determine optimal assignments of programs to memory regions. Your program is given the sizes of the memory regions available for the execution of programs, and for each program a description of how its running time depends on the amount of memory available to it. Your program has to find the execution schedule of the programs that minimizes the average turnaround time for the programs. An execution schedule is an assignment of programs to memory regions and times, such that no two programs use the same memory region at the same time, and no program is assigned to a memory region of size less than its minimum memory requirement. The turnaround time of the program is the difference between the time when the program was submitted for execution (which is time zero for all programs in this problem), and the time that the program completes execution.

Input

The input data will contain multiple test cases. Each test case begins with a line containing a pair of integers m and n. The number m specifies the number of regions into which primary memory has been partitioned (1 ≤ m ≤ 10), and n specifies the number of programs to be executed (1 ≤ n ≤ 50).

The next line contains m positive integers giving the sizes of the m memory regions. Following this are n lines, describing the time-space tradeoffs for each of the n programs. Each line starts with a positive integer k (k ≤ 10), followed by k pairs of positive integers s1, t1, s2, t2, . . . , sk, tk, that satisfy si < si+1 for 1 ≤ i < k. The minimum space requirement of the program is s1, i.e. it cannot run in a partition of size less than this number. If the program runs in a memory partition of size s, where si ≤ s < si+1 for some i, then its execution time will be ti . Finally, if the programs runs in a memory partition of size sk or more, then its execution time will be tk.

A pair of zeroes will follow the input for the last test case.

You may assume that each program will execute in exactly the time specified for the given region size, regardless of the number of other programs in the system. No program will have a memory requirement larger than that of the largest memory region.

Output

For each test case, first display the case number (starting with 1 and increasing sequentially). Then print the minimum average turnaround time for the set of programs with two digits to the right of the decimal point. Follow this by the description of an execution schedule that achieves this average turnaround time. Display one line for each program, in the order they were given in the input, that identifies the program number, the region in which it was executed (numbered in the order given in the input), the time when the program started execution, and the time when the program completed execution. Follow the format shown in the sample output, and print a blank line after each test case. If there are multiple program orderings or assignments to memory regions that yield the same minimum average turnaround time, give one of the schedules with the minimum average turnaround time.

Sample Input

2 4
40 60
1 35 4
1 20 3
1 40 10
1 60 7
3 5
10 20 30
2 10 50 12 30
2 10 100 20 25
1 25 19
1 19 41
2 10 18 30 42
0 0

Sample Output

Case 1
Average turnaround time = 7.75
Program 1 runs in region 1 from 0 to 4
Program 2 runs in region 2 from 0 to 3
Program 3 runs in region 1 from 4 to 14
Program 4 runs in region 2 from 3 to 10
Case 2
Average turnaround time = 35.40
Program 1 runs in region 2 from 25 to 55
Program 2 runs in region 2 from 0 to 25
Program 3 runs in region 3 from 0 to 19
Program 4 runs in region 3 from 19 to 60
Program 5 runs in region 1 from 0 to 18

链接

https://cn.vjudge.net/problem/UVALive-2238

题意

编程计算最优的内存分配策略，即给定m个区域的大小和n个程序在各种内存环境下的运行时间，找出一个调度方案，使得平均回转时间（即平均结束时刻）尽量小，具体来说，你需要给每个程序分配一个内存区域，使得没有两个程序在同一时间运行在同一个内存区域中，而所有程序分配的区域大小都不小于该程序的最低内存需求。程序对内存的需求不会超过最大内存块的大小。

题解

先来看一个内存区域的情况，假设在这个内存区域按顺序执行的k个程序的运行时间分别为$t_1, t_2, t_3, ..., t_k$，那么第i个程序的回转时间为$r_1 = t_1+t_2+...+t_i$，所有程序的回转时间之和等于$r = kt_1+(k-1)t_2 + (k-2)t_3 + ... + 2t_{k-1} + t_k$。换句话说，如果程序i是内存区域j的倒数第p个执行的程序，则它对于总回转时间的“贡献值”为$pT_{i, j}$，其中$T_{i,j}$为程序i在内存区域j中的运行时间。
这样一来，算法就比较明显了，构造二分图G，X结点为n个程序，Y结点为n*m个“位置”，其中位置(j, p)表示第j个内存区域的倒数第p个执行的程序。每个X结点i和Y结点(j, p)连一条权为$pT_{i, j}$的边，然后求最小权匹配即可。注意，并不是每个匹配都对应一个合法方案，但是最佳匹配一定对应一个合法方案，把倒数第1个程序放到倒数第2个程序处，显然运行时间将减少，因此最佳方案不可能只有倒数第1个程序，而没有倒数第2个程序。
——刘汝佳小白书P352

因为匹配的信息保存在Y结点（n*m个位置）中，所以我们可以遍历m个内存块，找到这些位置保存的n个程序的信息（只有n个程序，所以只有n个位置成功匹配），然后记录每个程序所在的内存编号以及开始时间，注意，内存中存的程序信息是倒序的，详情参看代码output部分。

更详细的题解和说明可以看网上的博客：传送门1，传送门2。

首先，我们可以证明最佳完美匹配一定是满足题意的一个解。因为我们采用了倒数第p个这样的设定。那么对应边权pT(i,j)中，T(i,j)一定是一个定值。那么自然是p越小越好。所以是先有了倒数第一才会有倒数第二。而不会只有倒数第一，没有倒数第二，又有倒数第三的不符合现实的情况。这在其他完美匹配就有可能出现，但最佳匹配避免开了这种情况。采用倒数还有一个好处是如果是设位置为(i,j,k)。表示在内存i中总共有j个程序要执行，该程序是第k个，那么对应边权为（j-k+1)*T(i,k),似乎也行。但是这种设定每个程序也可以和(i,j+1,k+1)匹配。反正在同一个内存中运行时间是固定的。匹配数巨大，也有可能出现不合常理的情况。给写代码带来困扰。

代码

两种写法的最主要的区别就是在于输出方案判断右边内存区域匹配的程序。法一是通过添加虚拟边判断if (Left >= n) // 匹配到虚拟结点，说明本region已经没有更多程序了实现的，法二直接判断if (linker[Right] == -1) // 未匹配到跳过。实测法二43ms左右，法一253ms左右。

写法一：n*m（左，n个程序+虚拟结点）——n*m（右，m块内存每块n个区域）

Status	Accepted
Time	253ms
Length	4680

#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <cmath>
#include <queue>
#include <vector>
using namespace std;
const int maxn = 1010;
const int maxp = 50 + 5;
const int maxr = 10 + 5;
const int inf = 0x3f3f3f3f;
int nx, ny, n, m;
int G[maxn][maxn], runtime[maxp][maxr];
int linker[maxn], lx[maxn], ly[maxn], slack[maxn];
bool visx[maxn], visy[maxn];

bool dfs(int x)
{
    visx[x] = true;
    for (int y = 0; y < ny; ++y)
    {
        if (visy[y] || !G[x][y]) // 注意这里需要判断x和y之间是否有边或者每条边初始化为-inf
            continue;
        int tmp = lx[x] + ly[y] - G[x][y];
        if (tmp == 0)
        {
            visy[y] = true;
            if (linker[y] == -1 || dfs(linker[y]))
            {
                linker[y] = x; // 匹配
                return true;
            }
        }
        else if (slack[y] > tmp)
            slack[y] = tmp;
    }
    return false;
}

void KM() // 二分图最大权值匹配
{
    memset(linker, -1, sizeof(linker));
    memset(ly, 0, sizeof(ly));
    for (int i = 0; i < nx; ++i)
    {
        lx[i] = -inf;
        for (int j = 0; j < ny; ++j)
        {
            if (G[i][j] > lx[i])
                lx[i] = G[i][j];
        }
    }
    for (int x = 0; x < nx; ++x)
    {
        for (int i = 0; i < ny; ++i)
            slack[i] = inf;
        while (true)
        {
            memset(visx, 0, sizeof(visx));
            memset(visy, 0, sizeof(visy));
            if (dfs(x))
                break;
            int d = inf;
            for (int i = 0; i < ny; ++i)
            {
                if (!visy[i] && d > slack[i])
                    d = slack[i];
            }
            for (int i = 0; i < nx; ++i)
            {
                if (visx[i])
                    lx[i] -= d;
            }
            for (int i = 0; i < ny; ++i)
            {
                if (visy[i])
                    ly[i] += d;
                else
                    slack[i] -= d;
            }
        }
    }
}

void output() // 具体方案
{
    int start[maxp], region_number[maxp], total = 0; // 程序p开始时间，所在内存编号，总时间
    for (int region = 0; region < m; region++) // 起始时刻、分配到得区域编号、总回转时间
    {
        vector <int> programs; // 本region执行的所有程序，按照逆序排列（因为是指“倒数”第pos个程序）
        for (int pos = 0; pos < n; pos++)
        {
            int Right = region * n + pos;
            int Left = linker[Right];
            if (Left >= n) // 匹配到虚拟结点，说明本region已经没有更多程序了
                break;
            programs.push_back(Left); // 保存程序
            region_number[Left] = region; // 记录编号
            total -= G[Left][Right]; // 计算时间
        }
        reverse(programs.begin(), programs.end());
        int time = 0;
        for (int i = 0; i < programs.size(); ++i) // 每个程序开始时间
        {
            start[programs[i]] = time;
            time += runtime[programs[i]][region];
        }
    }
    printf("Average turnaround time = %.2f\n", (double)total / n);
    for (int p = 0; p < n; ++p)
        printf("Program %d runs in region %d from %d to %d\n", p + 1, region_number[p] + 1, start[p], start[p] + runtime[p][region_number[p]]);
    printf("\n");
}

int main()
{
    int kase = 0;
    while (scanf("%d%d", &m, &n) != EOF && n && m)
    {
        memset(G, 0, sizeof(G));
        memset(runtime, 0, sizeof(runtime));
        nx = n * m,  ny = n * m;
        int size[maxr];
        for (int region = 0; region < m; ++region)
            scanf("%d", &size[region]);
        for (int program = 0; program < n; ++program) // 计算程序P在区域R的运行时间
        {
            int s[10], t[10], k;
            scanf("%d", &k);
            for (int i = 0; i < k; ++i)
                scanf("%d%d", &s[i], &t[i]);
            for (int region = 0; region < m; region++)
            {
                int & time = runtime[program][region];
                time = inf;
                if (size[region] < s[0]) // 内存不足s[0]时无法运行
                    continue;
                for (int i = 0; i < k; ++i)
                {
                    if (i == k - 1 || size[region] < s[i + 1]) // 内存满足s[i] <= s < s[i+1]时运行时间为ti，内存至少为s[k]，运行时间为tk
                    {
                        time = t[i];
                        break;
                    }
                }
                // 连边X(program) -> Y(region,pos) 程序p在内存r的倒数第p个位置的运行时间
                for (int pos = 0; pos < n; ++pos)
                    G[program][region * n + pos] = -(pos + 1) * time; // 最小权值匹配，取相反数
            }
        }
        for (int i = n; i < n * m; ++i) // 虚拟结点补全二分图
            for (int j = 0; j < n * m; ++j)
                G[i][j] = 1;
        KM();
        printf("Case %d\n", ++kase);
        output();
    }
    return 0;
}

写法二：n（左，n个程序）——n*m（右，m块内存每块n个区域）

Status	Accepted
Time	43ms
Length	4720

#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <cmath>
#include <queue>
#include <vector>
using namespace std;
const int maxn = 1010;
const int maxp = 50 + 5;
const int maxr = 10 + 5;
const int inf = 0x3f3f3f3f;
int nx, ny, n, m;
int G[maxn][maxn], runtime[maxp][maxr];
int linker[maxn], lx[maxn], ly[maxn], slack[maxn];
bool visx[maxn], visy[maxn];

bool dfs(int x)
{
    visx[x] = true;
    for (int y = 0; y < ny; ++y)
    {
        if (visy[y]) // 注意这里需要判断x和y之间是否连边
            continue;
        int tmp = lx[x] + ly[y] - G[x][y];
        if (tmp == 0)
        {
            visy[y] = true;
            if (linker[y] == -1 || dfs(linker[y]))
            {
                linker[y] = x; // 匹配
                return true;
            }
        }
        else if (slack[y] > tmp)
            slack[y] = tmp;
    }
    return false;
}

void KM() // 二分图最大权值匹配
{
    memset(linker, -1, sizeof(linker));
    memset(ly, 0, sizeof(ly));
    for (int i = 0; i < nx; ++i)
    {
        lx[i] = -inf;
        for (int j = 0; j < ny; ++j)
        {
            if (G[i][j] > lx[i])
                lx[i] = G[i][j];
        }
    }
    for (int x = 0; x < nx; ++x)
    {
        for (int i = 0; i < ny; ++i)
            slack[i] = inf;
        while (true)
        {
            memset(visx, 0, sizeof(visx));
            memset(visy, 0, sizeof(visy));
            if (dfs(x))
                break;
            int d = inf;
            for (int i = 0; i < ny; ++i)
            {
                if (!visy[i] && d > slack[i])
                    d = slack[i];
            }
            for (int i = 0; i < nx; ++i)
            {
                if (visx[i])
                    lx[i] -= d;
            }
            for (int i = 0; i < ny; ++i)
            {
                if (visy[i])
                    ly[i] += d;
                else
                    slack[i] -= d;
            }
        }
    }
}

void output() // 具体方案
{
    int start[maxp], region_number[maxp], total = 0; // 程序p开始时间，所在内存编号，总时间
//    for (int i = 0; i < nx; ++i)
//        total -= lx[i];
//    for (int j = 0; j < ny; ++j)
//        total -= ly[j];
    for (int i = 0; i < ny; ++i)
        if (linker[i] != -1)
            total -= G[linker[i]][i];
    for (int region = 0; region < m; region++) // 起始时刻、分配到得区域编号、总回转时间
    {
        vector <int> programs; // 本region执行的所有程序，按照逆序排列（因为是指“倒数”第pos个程序）
        for (int pos = 0; pos < n; pos++)
        {
            int Right = region * n + pos;
            int Left = linker[Right];
            if (linker[Right] == -1) // 未匹配到跳过
                continue;
            programs.push_back(Left); // 保存程序
            region_number[Left] = region; // 记录编号
        }
        reverse(programs.begin(), programs.end());
        int time = 0;
        for (int i = 0; i < programs.size(); ++i) // 每个程序开始时间
        {
            start[programs[i]] = time;
            time += runtime[programs[i]][region];
        }
    }
    printf("Average turnaround time = %.2f\n", (double)total / n);
    for (int p = 0; p < n; ++p)
        printf("Program %d runs in region %d from %d to %d\n", p + 1, region_number[p] + 1, start[p], start[p] + runtime[p][region_number[p]]);
    printf("\n");
}

int main()
{
    int kase = 0;
    while (scanf("%d%d", &m, &n) != EOF && n && m)
    {
        for (int i = 0; i < n*m; ++i)
            for (int j = 0; j < n*m; ++j)
                G[i][j] = -inf;
        memset(runtime, 0, sizeof(runtime));
        nx = n,  ny = n * m;
        int size[maxr];
        for (int region = 0; region < m; ++region)
            scanf("%d", &size[region]);
        for (int program = 0; program < n; ++program) // 计算程序P在区域R的运行时间
        {
            int s[10], t[10], k;
            scanf("%d", &k);
            for (int i = 0; i < k; ++i)
                scanf("%d%d", &s[i], &t[i]);
            for (int region = 0; region < m; region++)
            {
                int & time = runtime[program][region];
                time = inf;
                if (size[region] < s[0]) // 内存不足s[0]时无法运行
                    continue;
                for (int i = 0; i < k; ++i)
                {
                    if (i == k - 1 || size[region] < s[i + 1]) // 内存满足s[i] <= s < s[i+1]时运行时间为ti，内存至少为s[k]，运行时间为tk
                    {
                        time = t[i];
                        break;
                    }
                }
                // 连边X(program) -> Y(region,pos) 程序p在内存r的倒数第p个位置的运行时间
                for (int pos = 0; pos < n; ++pos)
                    G[program][region * n + pos] = -(pos + 1) * time; // 最小权值匹配，取相反数
            }
        }
        KM();
        printf("Case %d\n", ++kase);
        output();
    }
    return 0;
}

一开始按照刘汝佳老师的代码写的，KM算法是用的kuangbin的板子，但是连样例都过不了，我拿两份代码测试样例，发现加的边以及权值都是一样的，就是有问题，弄得我还以为bin聚的板子有问题。实际上并不一样，因为刘老师的写法除了一个邻接矩阵W来保存边权信息外，还有一个vector邻接表，实际上在跑最大权值匹配的时候是用邻接表来遍历边的，而我只用了一个邻接矩阵来保存边权信息，然后这里就出现问题了，实际上有些点之间并没有连边，比如有些程序的最低内存需求大于某个内存块的内存，这时候这两点之间就没有边，再加上我初始化二维数组是清为0，所以导致匹配出错，无法得到正确结果，后面终于发现了这个错误，太不容易了。

这道题的关键还是在于建图转化为二分图匹配，还有一点就是输出分配方案。

The end.
2018-08-26 星期日