子序列问题

最长公共子序列(LCS)

公共子序列即两个字符串中共有的子序列，现在需要寻找两个字符串的最长公共子序列

思路

本质上是选与不选的问题，我们考虑最后一对字母，两两组合有四种情况

更一般化：

当前操作：考虑s[i]与t[j]选或不选
子问题：s的前i个字母和t的前j个字母的LCS长度
下一个子问题
- s的前i-1个字母和t的前j-1个字母的LCS长度（都选或都不选）
- s的前i-1个字母和t的前j个字母的LCS长度（选s[i]不选t[j]）
- s的前i个字母和t的前j-1个字母的LCS长度（选t[j]不选s[i]）

转移方程

dfs(i,j) = max(dfs(i-1, j-i), dfs(i-1, j), dfs(i,j-1)+(s[i]==t[j]))

但可以考虑两个问题：

s[i]==t[j]时，需要dfs(i-1, j-i), dfs(i-1, j)吗？

不需要，可以从子序列方向推出矛盾

s[i]!=t[j]时，需要dfs(i-1, j-1)吗？

不需要，dfs(i-1, j-1)的结果已经在dfs(i-1, j-i), dfs(i-1, j)里面了

故方程可以简化为 \[ f[i][j] = \begin{cases} f[i-1][j-1]+1 &&s[i]=t[j]\\ max(f[i-1][j], f[i][j-1]) &&s[i]\neq t[j] \end{cases} \]

code

朴素LCS：空间和时间都为\(O(n^2)\)

int dp[N][N]={};
void solve() {
  
  int n;
  cin >> n;
  vector<int> s(n+1, 0);
  vector<int> t(n+1, 0);
  for (int i = 1;i<=n;i++) cin >> s[i];
  for (int i = 1;i<=n;i++) cin >> t[i];
  for (int i = 1;i<=n;i++){
    for (int j = 1;j<=n;j++){
      if (s[i]==t[j]){
        dp[i][j] = dp[i-1][j-1]+1;
      }
      else{
        dp[i][j] = max(dp[i-1][j],dp[i][j-1]);
      }
    }
  }
  cout << dp[n][n];
}

一维数组优化：对于计算当前行与上一行，需要考虑左、上与左上三种状态

此时左上的状态会被之前的计算给覆盖掉，故需要开一个临时变量存储

int dp[N]={};
void solve() {
  
  int n;
  cin >> n;
  vector<int> s(n+1, 0);
  vector<int> t(n+1, 0);
  for (int i = 1;i<=n;i++) cin >> s[i];
  for (int i = 1;i<=n;i++) cin >> t[i];
  for (int i = 1;i<=n;i++){
    int pre = dp[0]; // 初始化为0
    for (int j = 1;j<=n;j++){
      int temp = dp[j+1]; // 记录下一个的左上状态
      if (s[i]==t[j]){
        dp[j] = pre+1;
      }
      else{
        dp[j] = max(dp[j],dp[j-1]);
      }
      pre = temp; // 将上一个状态还原
    }
  }
  cout << dp[n];

}

转化为LIS：

对于t序列的每个元素，映射为在s序列中的位置组成数组p

这时我们在t中选出一个子序列，且要保持相对顺序，若对应的p的子序列也是上升的，证明他们在s中也是按照相同先后顺序出现的

这说明选择的这些元素在s和t中都保持着相同的相对顺序，也就是公共子序列

故我们可以将求LCS的问题转换成求t在s的相对位置上的LIS的问题

例：

s = [ 5, 2, 8, 1, 6 ]，t = [ 2, 6, 1, 8 ]

则t在s的位置序列p = [2, 5, 4, 3]

LIS可以为[2, 4], 对应p[1], p[3]，即在s上的2，4位置元素与t上的1，3位置元素都为[2,1]，是一个公共子序列

void solve() {
  int n;
  cin >> n;
  vector<int> s(n+1, 0);
  vector<int> t(n+1, 0);
  vector<int> dp;
  map<int,int> cnt;
  for (int i = 1;i<=n;i++) {
    cin >> s[i];
    cnt[s[i]] = i; // 存储s数组中各数的位置
  }
  for (int i = 1;i<=n;i++) cin >> t[i];
  for (int i = 1;i<=n;i++){
    auto j = lower_bound(dp.begin(),dp.end(),cnt[t[i]]); // 寻找t数组中值对应在s数组中的位置
    // 求对应位置序列的LIS
    if (j==dp.end()){ 
      dp.push_back(cnt[t[i]]);
    }
    else{
      *j = cnt[t[i]];
    }
  }
  cout << dp.size();

}

最长上升子序列(LIS)

DP

考虑子序列问题，用子集型回溯的思路考虑：

选或不选：取最后一个数字作为序列尾，需要知道上一个数的下标
枚举选哪个：取最后一个数字，向前枚举，比较当前选的数和下一个要选的数

则我们有思路：枚举num[i]作为LIS末尾元素，然后枚举num[j]为LIS倒数第二个元素，

其中j<i且nums[j]<nums[i]

子问题：以nums[i]结尾的LIS长度
当前操作：枚举nums[j]
下一个子问题：以nums[j]结尾的LIS长度

#include <bits/stdc++.h>
#define endl '\n'
using namespace std;
typedef long long ll;
typedef pair<int,int> PII;
const int N = 5e3 + 10;
const int mod = 1e9 + 7;
void solve() {
  // 让dp[i]是以i结尾的数组成的上升子序列的长度
  // 考虑枚举选哪个，即比较当前选的数字和该数字之前的数字，选择更小的作为倒数第二个数
  /*
  则我们的子问题：num[i]结尾的LIS长度
  操作：枚举num[j]
  下一个子问题：以num[j]结尾的LIS长度
  将子问题求最大值+1即为LIS
  */
  int n;
  cin >> n;
  ll a[N]={};
  ll dp[N]={};
  for (int i = 1;i<=n;i++) cin >> a[i];
  for (int i = 1;i<=n;i++){
    dp[i]=1;
    for (int j = 1;j<i;j++){
      if (a[j]<a[i]) dp[i]=max(dp[i],dp[j]+1);
    }
  }
  ll maxn = -1;
  for (int i = 1;i<=n;i++){
    maxn = max(maxn,dp[i]);
  }
  cout << maxn;
  
}
int main()
{
  ios::sync_with_stdio(0); cin.tie(0); cout.tie(0);
  int tomorin=1;
  //cin >> tomorin;
  while (tomorin--) solve();
  return 0;
}

与LCS

nums的LIS等价于nums与排序去重后的nums的LCS

贪心+二分

通过交换状态与状态值来进行优化

设g[i]为长度为i+1的IS的末尾元素最小值

有以下结论：

g是严格递增的
一次只能更新一个位置
更新的位置是第一个>=nums[i]的数的下标，若nums[i]比g的最后一个还大，那就加到末尾，这里可以用二分查找

#include <bits/stdc++.h>
#define endl '\n'
using namespace std;
typedef long long ll;
typedef pair<int,int> PII;
const int N = 2e5 + 10;
const int mod = 1e9 + 7;
void solve() {
  /*
  进一步优化，交换状态与状态值
  让g[i]表示长度为i+1的上升子序列的末尾元素的最小值

  其中g是严格递增的
  且一次只能更新一次，更新的位置是第一个>=num[i]的数的下标

  故我们可以使用二分来查找第一个>=num[i]的数的下标j,若不存在则直接添加到g末尾，否则修改g[j]为num[i]
  */
  int n;
  cin >> n;
  vector<int> arr(n+1);
  for (int i = 1;i<=n;i++) cin >> arr[i];
  vector<int> g;
  for (int i = 1;i<=n;i++){
    auto j = lower_bound(g.begin(), g.end(),arr[i]); // 若不是严格递增，相等也可以，这里改成upperbound即可
    if (j==g.end()) g.push_back(arr[i]);
    else {
      *j = arr[i];
    }
  }
  cout << g.size();
}
int main()
{
  ios::sync_with_stdio(0); cin.tie(0); cout.tie(0);
  int tomorin=1;
  //cin >> tomorin;
  while (tomorin--) solve();
  return 0;
}

红魔咖啡馆

【动态规划】子序列问题