
First, the Guest Additions need to be installed in the guest.

On the host:

Configure a shared folder for the guest (details omitted here).

In the guest:

List the available shared folders:

$ sudo VBoxControl sharedfolder list
Oracle VM VirtualBox Guest Additions Command Line Management Interface Version 6.0.8
(C) 2008-2019 Oracle Corporation
All rights reserved.

Shared Folder mappings (1):

01 - Downloads [idRoot=0 writable auto-mount host-icase guest-icase mnt-pt=/media/host/]

If auto-mount is enabled, VirtualBox mounts the share automatically; otherwise it can be mounted manually:

$ sudo mount -t vboxsf Downloads /media/host
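
To make this mount persist across reboots, an /etc/fstab entry can be used instead. This is only a minimal sketch, assuming the share is named Downloads and /media/host already exists (the vboxsf module must be available at boot, which the Guest Additions normally take care of):

Downloads  /media/host  vboxsf  defaults  0  0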

To fix permission problems when accessing the mount, add the current user to the vboxsf group:

$ sudo adduser $USER vboxsf

Then log out and log back in, or run:

$ newgrp vboxsf

Installing the VirtualBox Guest Additions on a GUI-less Debian buster

On the host:

Start the guest, then click the menu Devices -> Insert Guest Additions CD Image…

In the guest:

Install the build dependencies for the kernel modules:

$ sudo apt-get install -y dkms build-essential linux-headers-$(uname -r)

Mount the CD-ROM and install the Guest Additions:

$ sudo mount /dev/cdrom /media/cdrom
$ cd /media/cdrom
$ sudo su
# ./VBoxLinuxAdditions.run
Verifying archive integrity... All good.
Uncompressing VirtualBox 6.0.8 Guest Additions for Linux........
VirtualBox Guest Additions installer
Removing installed version 6.0.8 of VirtualBox Guest Additions...
Copying additional installer modules ...
Installing additional modules ...
VirtualBox Guest Additions: Starting.
VirtualBox Guest Additions: Building the VirtualBox Guest Additions kernel
modules. This may take a while.
VirtualBox Guest Additions: To build modules for other installed kernels, run
VirtualBox Guest Additions: /sbin/rcvboxadd quicksetup <version>
VirtualBox Guest Additions: or
VirtualBox Guest Additions: /sbin/rcvboxadd quicksetup all
VirtualBox Guest Additions: Building the modules for kernel 4.19.0-5-amd64.
update-initramfs: Generating /boot/initrd.img-4.19.0-5-amd64
VirtualBox Guest Additions: Running kernel modules will not be replaced until
the system is restarted

Reboot the guest:

# reboot

Verify the installation:

$ lsmod | grep vboxguest
vboxguest 348160 2 vboxsf

The installation succeeded.


A very old server kept printing error messages on the terminal:

[ 1250.944486] mce: [Hardware Error]: Machine check events logged
[ 1250.944493] [Hardware Error]: Corrected error, no action required.
[ 1250.948666] [Hardware Error]: CPU:24 (10:9:1) MC4_STATUS[Over|CE|MiscV|-|AddrV|CECC]: 0xdc0a400079080a13
[ 1250.952631] [Hardware Error]: Error Addr: 0x00000004abffce80
[ 1250.952633] [Hardware Error]: MC4 Error (node 6): DRAM ECC error detected on the NB.
[ 1250.952654] EDAC MC6: 1 CE on mc#6csrow#3channel#0 (csrow:3 channel:0 page:0x4abffc offset:0xe80 grain:0 syndrome:0x7914)
[ 1250.952656] [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: RES (no timeout)

This shows memory errors; they were corrected, but the RAM is clearly starting to fail.

First, look at the system's CPU and NUMA node information:

$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 4
NUMA node(s): 8
Vendor ID: AuthenticAMD
CPU family: 16
Model: 9
Model name: AMD Opteron(tm) Processor 6128
Stepping: 1
CPU MHz: 800.000
CPU max MHz: 2000.0000
CPU min MHz: 800.0000
BogoMIPS: 4000.04
Virtualization: AMD-V
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 5118K
NUMA node0 CPU(s): 0-3
NUMA node1 CPU(s): 4-7
NUMA node2 CPU(s): 8-11
NUMA node3 CPU(s): 12-15
NUMA node4 CPU(s): 16-19
NUMA node5 CPU(s): 20-23
NUMA node6 CPU(s): 24-27
NUMA node7 CPU(s): 28-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save pausefilter

There are four sockets, i.e. four CPUs, each with eight cores, for 32 cores in total. On a NUMA machine, each CPU generally has at least one local memory controller.

Install edac-utils and check the memory controller information:

$ sudo apt install edac-utils
$ edac-util -vs
edac-util: EDAC drivers are loaded. 4 MCs detected:
mc0:F10h
mc2:F10h
mc4:F10h
mc6:F10h

There are four memory controllers. Now check the errors recorded by each of them:


$ edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: mc#0csrow#2channel#0: 0 Corrected Errors
mc0: csrow3: 0 Uncorrected Errors
mc0: csrow3: mc#0csrow#3channel#0: 0 Corrected Errors
mc2: 0 Uncorrected Errors with no DIMM info
mc2: 0 Corrected Errors with no DIMM info
mc2: csrow2: 0 Uncorrected Errors
mc2: csrow2: mc#2csrow#2channel#0: 0 Corrected Errors
mc2: csrow3: 0 Uncorrected Errors
mc2: csrow3: mc#2csrow#3channel#0: 0 Corrected Errors
mc4: 0 Uncorrected Errors with no DIMM info
mc4: 0 Corrected Errors with no DIMM info
mc4: csrow2: 0 Uncorrected Errors
mc4: csrow2: mc#4csrow#2channel#0: 0 Corrected Errors
mc4: csrow3: 0 Uncorrected Errors
mc4: csrow3: mc#4csrow#3channel#0: 0 Corrected Errors
mc6: 0 Uncorrected Errors with no DIMM info
mc6: 0 Corrected Errors with no DIMM info
mc6: csrow2: 0 Uncorrected Errors
mc6: csrow2: mc#6csrow#2channel#0: 2 Corrected Errors
mc6: csrow3: 0 Uncorrected Errors
mc6: csrow3: mc#6csrow#3channel#0: 4 Corrected Errors

Alternatively, read the counters directly from sysfs:

$ grep [0-9] /sys/devices/system/edac/mc/mc*/csrow*/*ce_count
/sys/devices/system/edac/mc/mc0/csrow2/ce_count:0
/sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow3/ce_count:0
/sys/devices/system/edac/mc/mc0/csrow3/ch0_ce_count:0
/sys/devices/system/edac/mc/mc2/csrow2/ce_count:0
/sys/devices/system/edac/mc/mc2/csrow2/ch0_ce_count:0
/sys/devices/system/edac/mc/mc2/csrow3/ce_count:0
/sys/devices/system/edac/mc/mc2/csrow3/ch0_ce_count:0
/sys/devices/system/edac/mc/mc4/csrow2/ce_count:0
/sys/devices/system/edac/mc/mc4/csrow2/ch0_ce_count:0
/sys/devices/system/edac/mc/mc4/csrow3/ce_count:0
/sys/devices/system/edac/mc/mc4/csrow3/ch0_ce_count:0
/sys/devices/system/edac/mc/mc6/csrow2/ce_count:3
/sys/devices/system/edac/mc/mc6/csrow2/ch0_ce_count:3
/sys/devices/system/edac/mc/mc6/csrow3/ce_count:6
/sys/devices/system/edac/mc/mc6/csrow3/ch0_ce_count:6

The errors are on MC6, csrow2 and csrow3, i.e. on channel 0 of the fourth memory controller (the fourth CPU), which corresponds to the DIMM0 module there.
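
To relate that channel/slot to a physical DIMM label on the board, dmidecode can list the populated slots; a rough sketch (the locator strings are vendor-specific):

$ sudo dmidecode --type memory | grep -E 'Locator|Size'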


In the examples below there is only one guest, named "buster"; the terms "virtual machine" and "guest" are used interchangeably.

List the virtual machines:

$ VBoxManage list vms
"buster" {6c18ec7b-5730-4e8e-a7d1-768dd0601be1}

Start a VM in headless mode; it automatically runs in the background:

$ VBoxManage startvm buster --type headless
Waiting for VM "buster" to power on...
VM "buster" has been successfully started.

Start a VM in headless mode, in the foreground:

$ VBoxHeadless -s buster
Oracle VM VirtualBox Headless Interface 6.0.8
(C) 2008-2019 Oracle Corporation
All rights reserved.

List the running guests:

$ VBoxManage list runningvms
"buster" {6c18ec7b-5730-4e8e-a7d1-768dd0601be1}

Power off the guest:

$ VBoxManage controlvm buster poweroff
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
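
poweroff cuts the virtual power immediately. For a clean shutdown, an ACPI power button press can be sent instead, letting the guest OS shut itself down (assuming the guest reacts to ACPI power button events):

$ VBoxManage controlvm buster acpipowerbutton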

The VirtualBox GUI can also start a guest in headless mode: select Start -> Headless Start.

Running ntpdate on a client to synchronize the time failed with:

ntpdate[10877]: no server suitable for synchronization found

Run ntpdate in debug mode:

# ntpdate -d 192.168.0.3
5 May 14:52:58 ntpdate[10877]: ntpdate 4.2.6p5@1.2349-o Fri Jul 22 17:30:52 UTC 2016 (1)
transmit(192.168.0.3)
receive(192.168.0.3)
transmit(192.168.0.3)
receive(192.168.0.3)
transmit(192.168.0.3)
receive(192.168.0.3)
transmit(192.168.0.3)
receive(192.168.0.3)
192.168.0.3: Server dropped: Leap not in sync
server 192.168.0.3, port 123
stratum 3, precision -23, leap 11, trust 000
refid [192.168.0.3], delay 0.02617, dispersion 0.00005
transmitted 4, in filter 4
reference time: e07906ca.595bf596 Sun, May 5 2019 14:52:58.349
originate timestamp: e07906d0.bb8e930a Sun, May 5 2019 14:53:04.732
transmit timestamp: e07906d0.84432bb9 Sun, May 5 2019 14:53:04.516
filter delay: 0.02632 0.02621 0.02617 0.02617
0.00000 0.00000 0.00000 0.00000
filter offset: 0.215621 0.215522 0.215512 0.215605
0.000000 0.000000 0.000000 0.000000
delay 0.02617, dispersion 0.00005
offset 0.215512

5 May 14:53:04 ntpdate[10877]: no server suitable for synchronization found

The error is "Server dropped: Leap not in sync".

The fix is to force the NTP server to synchronize with its upstream servers once:

# service ntp stop
# ntpdate -b pool.ntp.org
# service ntp start
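
To confirm that the server is synchronized again before retrying the clients, the peer status can be inspected on the server; an asterisk in front of a peer marks the currently selected sync source:

$ ntpq -p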

Then run the time synchronization again on the client:

# ntpdate 192.168.0.3

A server suddenly spawned a lot of acpi_pad/* processes (* = 0, 1, 2, 3, …), each using 100% CPU and making the system extremely slow.

The fix is to remove the acpi_pad module:

# rmmod acpi_pad

and add it to the module blacklist by putting the following line in /etc/modprobe.d/blacklist.conf:

blacklist acpi_pad

Alternatively, append acpi_pad.disable=1 to the kernel command line in the GRUB configuration; after rebooting, the acpi_pad module will no longer be loaded automatically.
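
On a Debian-style system this usually means adding the parameter to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and regenerating the GRUB config; a rough sketch (the existing contents of that variable will differ per machine):

$ sudo editor /etc/default/grub    # e.g. GRUB_CMDLINE_LINUX_DEFAULT="quiet acpi_pad.disable=1"
$ sudo update-grub
$ sudo reboot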

Note: when I got to the server room in the morning, the air conditioning had switched itself off and the ambient temperature was over 40°C; the servers and racks were all sounding alarms, so I quickly turned on two AC units to cool things down. Could the acpi_pad problem be related to the excessively high ambient temperature?

The server that stores the archive files went down with a hardware failure, so writes failed and PostgreSQL could not archive WAL. As a temporary workaround, the configuration can be changed to:

archive_command=''

Then reload PostgreSQL:

$ sudo service postgresql reload

With this setting the server does not ship any archive files, but it keeps accumulating the WAL segments it generates; once a proper archive command is provided, archiving resumes and no WAL files are lost.
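
When the archive destination is back, a real command can be put in place again, for example the classic one from the PostgreSQL documentation (the path /mnt/server/archivedir is only a placeholder), followed by another reload:

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'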

If instead the following archive command is provided:

archive_command='/bin/true'

then the archiver always reports success even though nothing is actually archived, and the WAL segments on the server get removed (subject to wal_keep_segments). When the hardware is repaired or replaced, a new base backup must be taken, because the archived WAL files are missing.
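
A new base backup can be made with pg_basebackup; a minimal sketch, assuming it is run as the postgres user over a local connection and that the target directory (a placeholder here) is empty:

$ pg_basebackup -D /var/backups/pg_base -Fp -X stream -P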

The Linux kernel on amd64 uses a four-level page table structure, but two different sets of terminology are used to describe it.

One set:

PML4 (Page Map Level 4) (Level 4) -> PDP (Page Directory Pointer) (Level 3) -> PD (Page Directory) (Level 2) -> PTE (Page Table Entry) (Level 1)

The other:

PGD (Page Global Directory) (Level 4) -> PUD (Page Upper Directory) (Level 3) -> PMD (Page Middle Directory) (Level 2) -> PTE (Page Table Entry) (Level 1)

The two sets of terms describe exactly the same thing. The CR3 register holds the physical address of the Level 4 table; the Level 4 table is a single 4K page with 512 entries, each entry mapping 512GB, so 256TB can be addressed in total.
Current amd64 processors only use 48 of the 64 bits for addressing, so with four-level paging the low 48 bits of a virtual address are split 9+9+9+9+12; each physical page frame is 4K in size and 4K-aligned.
When 2M huge pages are used, only three levels are walked and the low 48 bits are split 9+9+9+21; each physical page frame is 2M in size and 2M-aligned.
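
As a quick illustration of the 9+9+9+9+12 split, the four table indices and the page offset of a 48-bit virtual address can be computed with plain shell arithmetic (the address below is an arbitrary example):

$ addr=0x00007f1234567890
$ printf 'PGD=%d PUD=%d PMD=%d PTE=%d offset=0x%x\n' \
    $(( (addr >> 39) & 0x1ff )) $(( (addr >> 30) & 0x1ff )) \
    $(( (addr >> 21) & 0x1ff )) $(( (addr >> 12) & 0x1ff )) \
    $(( addr & 0xfff ))
PGD=254 PUD=72 PMD=418 PTE=359 offset=0x890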

See the two figures below:

Figure: 4K page address translation

Figure: 2M page address translation

To support a larger linear address space, future CPUs will extend virtual addressing to 57 bits. Linux already provides five-level page table support; when five-level paging is enabled, a new level called P4D (Page 4 Directory) is inserted between the PGD and the PUD, making the PGD the fifth level; see [1].
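
Whether a given processor supports 57-bit virtual addressing can be checked by looking for the la57 CPU flag; a quick sketch:

$ grep -qw la57 /proc/cpuinfo && echo 'la57 (57-bit virtual addressing) supported' || echo 'not supported'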

References:
[1] Five-level page tables

CPL vs. DPL vs. RPL

To make this simpler, let’s first just consider CPL and DPL:

  • The CPL is your current privilege level.
  • The DPL is the privilege level of a segment. It defines the minimum[1] privilege level required to access the segment.
  • Privilege levels range from 0-3; lower numbers are more privileged
  • So: To access a segment, CPL must be less than or equal to the DPL of the segment

RPL is a privilege level associated with a segment selector. A segment selector is just a 16-bit value that references a segment. Every memory access (implicitly[2] or otherwise) uses a segment selector as part of the access.

When accessing a segment, there are actually two checks that must be performed. Access to the segment is only allowed if both of the following are true:

  • CPL <= DPL
  • RPL <= DPL

So even if CPL is sufficiently privileged to access a segment, the access will still be denied if the segment selector that references that segment is not sufficiently privileged.

The motivation behind RPL

What’s the purpose of this? Well, the reasoning is a bit dated now, but the Intel documentation offers a scenario that goes something like this:

  • Suppose the operating system provides a system call that accepts a logical address (segment selector + offset) from the caller and writes to that address
  • Normal applications run with a CPL of 3; system calls run with a CPL of 0
  • Let’s say some segment (we’ll call it X) has a DPL of 0

An application would ordinarily not be able to access the memory in segment X (because CPL > DPL). But depending on how the system call was implemented, an application might be able to invoke the system call with a parameter of an address within segment X. Then, because the system call is privileged, it would be able to write to segment X on behalf of the application. This could introduce a privilege escalation vulnerability into the operating system.

To mitigate this, the official recommendation is that when a privileged routine accepts a segment selector provided by unprivileged code, it should first set the RPL of the segment selector to match that of the unprivileged code[3]. This way, the operating system would not be able to make any accesses to that segment that the unprivileged caller would not already be able to make. This helps enforce the boundary between the operating system and applications.

Then and now

Segment protection was introduced with the 286, before paging existed in the x86 family of processors. Back then, segmentation was the only way to restrict access to kernel memory from a user-mode context. RPL provided a convenient way to enforce this restriction when passing pointers across different privilege levels.

Modern operating systems use paging to restrict access to memory, which removes the need for segmentation. Since we don’t need segmentation, we can use a flat memory model, which means segment registers CS, DS, SS, and ES all have a base of zero and extend through the entire address space. In fact, in 64-bit “long mode”, a flat memory model is enforced, regardless of the contents of those four segment registers. Segments are still used sometimes (for example, Windows uses FS and GS to point to the Thread Information Block and 0x23 and 0x33 to switch between 32- and 64-bit code, and Linux is similar), but you just don’t go passing segments around anymore. So RPL is mostly an unused leftover from older times.
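
As a small illustration of the selector layout (descriptor table index in bits 15-3, table indicator in bit 2, RPL in bits 1-0), the two Windows code selectors mentioned above decode as follows:

$ for sel in 0x23 0x33; do printf '%s: index=%d TI=%d RPL=%d\n' "$sel" $(( sel >> 3 )) $(( (sel >> 2) & 1 )) $(( sel & 3 )); done
0x23: index=4 TI=0 RPL=3
0x33: index=6 TI=0 RPL=3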

RPL: Was it ever necessary?

You asked why it was a necessity to have both DPL and RPL. Even in the context of the 286, it wasn’t actually a necessity to have RPL. Considering the above scenario, a privileged procedure could always just retrieve the DPL of the provided segment via the LAR instruction, compare this to the privilege of the caller, and preemptively bail out if the caller’s privilege is insufficient to access the segment. However, setting the RPL, in my opinion, is a more elegant and simpler way of managing segment accesses across different privilege levels.

To learn more about privilege levels, check out Volume 3 of Intel’s software developer manuals, particularly the sections titled “Privilege Levels” and “Checking Caller Access Privileges”.

[1] Technically, the DPL can have different meanings depending on what type of segment or gate is being accessed. For the sake of simplicity, everything I describe applies to data segments specifically. Check the Intel docs for more information.
[2] For example, the instruction pointer implicitly uses the segment selector stored in CS when fetching instructions; most types of data accesses implicitly use the segment selector stored in DS, etc.
[3] See the ARPL instruction (16-bit/32-bit protected mode only).

mplayer's OSD (On-Screen Display) depends on X11, so subtitles cannot be displayed :( For now I have switched to IINA.