Removing the process is pointless, another one will just show up. For now I've deleted everything listed, and it hasn't reappeared yet. I ran into this about a year ago, but back then I didn't delete systemd-login. Everything else gets rolled back on again )
earlier it used wget to pull down the script, and even renamed wget for that))
Melkij, thanks for the answer. Tell me, how can views (VIEWS) be pushed through logical replication? When you try to add them to a publication, they, of course, say
So to actually run ALTERs I have to break out pglogical? Set up a separate subscription in it for all 30 of the tables I need, and all of that just to be able to do an ALTER?
In short, it got even worse: the remaining 2 replicas did not go down at 7 am MSK (0 by EST). I no longer understand anything at all. If I restart a working replica, then
2019-04-14 02:51:54.246 EST [62912] LOG: 00000: received fast shutdown request
2019-04-14 02:51:54.246 EST [62912] LOCATION: pmdie, postmaster.c:2659
2019-04-14 02:51:54.246 EST [62912] LOG: 00000: aborting any active transactions
2019-04-14 02:51:54.246 EST [62912] LOCATION: pmdie, postmaster.c:2685
2019-04-14 02:51:54.246 EST [62982] FATAL: 57P01: terminating walreceiver process due to administrator command
2019-04-14 02:51:54.246 EST [62982] LOCATION: ProcessWalRcvInterrupts, walreceiver.c:167
2019-04-14 02:51:54.246 EST [62983] postgres@postgres FATAL: 57P01: terminating connection due to administrator command
2019-04-14 02:51:54.246 EST [62983] postgres@postgres LOCATION: ProcessInterrupts, postgres.c:2915
2019-04-14 02:51:54.246 EST [62984] postgres@postgres FATAL: 57P01: terminating connection due to administrator command
2019-04-14 02:51:54.246 EST [62984] postgres@postgres LOCATION: ProcessInterrupts, postgres.c:2915
2019-04-14 02:51:55.704 EST [62914] LOG: 00000: shutting down
2019-04-14 02:51:55.704 EST [62914] LOCATION: ShutdownXLOG, xlog.c:8038
2019-04-14 02:51:55.712 EST [62914] LOG: 00000: database system is shut down
2019-04-14 02:51:55.712 EST [62914] LOCATION: ShutdownXLOG, xlog.c:8073
2019-04-14 02:51:56.483 EST [79125] LOG: 00000: database system was shut down in recovery at 2019-04-14 02:51:55 EST
2019-04-14 02:51:56.483 EST [79125] LOCATION: StartupXLOG, xlog.c:6012
2019-04-14 02:51:56.483 EST [79125] LOG: 00000: entering standby mode
2019-04-14 02:51:56.483 EST [79125] LOCATION: StartupXLOG, xlog.c:6087
2019-04-14 02:51:56.587 EST [79125] LOG: 00000: redo starts at 1BFB/4A2951C8
2019-04-14 02:51:56.587 EST [79125] LOCATION: StartupXLOG, xlog.c:6785
2019-04-14 02:51:56.818 EST [79128] [unknown]@[unknown] LOG: 08P01: incomplete startup packet
2019-04-14 02:51:56.818 EST [79128] [unknown]@[unknown] LOCATION: ProcessStartupPacket, postmaster.c:1914
2019-04-14 02:51:57.323 EST [79133] postgres@postgres FATAL: 57P03: the database system is starting up
2019-04-14 02:51:57.323 EST [79133] postgres@postgres LOCATION: ProcessStartupPacket, postmaster.c:2204
2019-04-14 02:51:57.465 EST [79125] LOG: 00000: consistent recovery state reached at 1BFB/6E9C7840
2019-04-14 02:51:57.465 EST [79125] LOCATION: CheckRecoveryConsistency, xlog.c:7588
2019-04-14 02:51:57.465 EST [79125] LOG: 00000: invalid record length at 1BFB/6E9C7840
2019-04-14 02:51:57.465 EST [79125] LOCATION: ReadRecord, xlog.c:4012
2019-04-14 02:51:57.465 EST [79124] LOG: 00000: database system is ready to accept read only connections
2019-04-14 02:51:57.465 EST [79124] LOCATION: sigusr1_handler, postmaster.c:4991
2019-04-14 02:51:57.495 EST [79135] LOG: 00000: started streaming WAL from primary at 1BFB/6E000000 on timeline 1
2019-04-14 02:51:57.495 EST [79135] LOCATION: WalReceiverMain, walreceiver.c:363
that's it, the replica is up
-2:~# ps ax|grep wal
79135 ? Ss 0:08 postgres: 9.5/main: wal receiver process streaming 1BFB/94130158
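To put a number on how far the receiver has moved since the restart, the two LSNs from the log and the `ps` output above can be subtracted. A minimal sketch, assuming the usual `HI/LO` hex notation where the part before the slash is the upper 32 bits of the byte position; `parse_lsn` and `lsn_diff_bytes` are illustrative helper names, not PostgreSQL functions:

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'HI/LO' (both hex) into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff_bytes(later: str, earlier: str) -> int:
    """Number of WAL bytes between two LSNs."""
    return parse_lsn(later) - parse_lsn(earlier)


started = "1BFB/6E000000"   # "started streaming WAL from primary at ..."
current = "1BFB/94130158"   # from the wal receiver line in ps
received = lsn_diff_bytes(current, started)
print(f"{received} bytes (~{received / 1024**2:.1f} MiB) streamed since the restart")
```

The same subtraction works for comparing a replica's position against the master's current position, if that value is available.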
Could anything at all on the other replicas somehow affect these crashes? The question sounds kind of crazy....
Is %h something to add on the replica? I added it; so far here is what the dead replica shows
2019-04-13 12:47:18.369 EST [32707] LOG: database system was shut down in recovery at 2019-04-13 12:47:16 EST
2019-04-13 12:47:18.369 EST [32707] LOG: entering standby mode
[local] 2019-04-13 12:47:18.424 EST [32708] [unknown]@[unknown] LOG: incomplete startup packet
2019-04-13 12:47:18.427 EST [32707] LOG: redo starts at 1BCA/B39678F0
2019-04-13 12:47:18.908 EST [32707] LOG: consistent recovery state reached at 1BCA/C70768B8
2019-04-13 12:47:18.908 EST [32706] LOG: database system is ready to accept read only connections
2019-04-13 12:47:19.618 EST [32718] LOG: started streaming WAL from primary at 1BCA/DA000000 on timeline 1
2019-04-13 12:47:19.671 EST [32718] FATAL: terminating walreceiver process due to administrator command
Who could be killing it, even in theory? And where should I dig next?
2019-04-13 15:22:54 - that time hasn't happened yet; this is EST, it has only just turned 12 there
If the time is 15-7, that makes 8, and there is a possibly interesting entry near that time.
And there are quite a lot of entries like this
2019-04-13 10:47:38.334 EST [10121] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 10:51:06.645 EST [11749] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 10:54:47.019 EST [12950] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 10:57:10.068 EST [13945] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 10:59:34.017 EST [14668] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:02:54.328 EST [16177] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:13:34.900 EST [20508] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:28:13.727 EST [26485] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:30:11.725 EST [27348] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:34:26.986 EST [28815] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:35:19.129 EST [29301] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:41:21.386 EST [31630] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:42:14.747 EST [31888] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:49:20.204 EST [34554] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:51:48.013 EST [35694] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 11:54:47.104 EST [36703] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 12:05:33.801 EST [41517] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 12:08:27.831 EST [42435] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 12:08:44.030 EST [42502] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 08:21:09.926 EST [50291] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed
2019-04-13 08:21:09.926 EST [50291] postgres@[unknown] LOCATION: XLogRead, walsender.c:2148
but around it there's the same dull stuff
2019-04-13 08:21:05.204 EST [50231] tracker@aaa LOG: 08006: could not receive data from client: Connection reset by peer
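Since there are a lot of these entries, a quick tally per segment shows whether it is one segment being requested over and over (as it appears to be here) or several. A minimal sketch, assuming the log line layout shown above; `summarize` and the `sample` list are illustrative, and in practice the lines would come from the master's log file:

```python
import re
from collections import Counter

# Matches the "has already been removed" errors in the layout seen above.
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+ \[(?P<pid>\d+)\] .*"
    r"requested WAL segment (?P<segment>[0-9A-F]{24}) has already been removed"
)


def summarize(lines):
    """Count errors per segment and remember the first timestamp in input order."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        segment = match.group("segment")
        counts[segment] += 1
        first_seen.setdefault(segment, match.group("ts"))
    return counts, first_seen


sample = [
    "2019-04-13 10:47:38.334 EST [10121] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed",
    "2019-04-13 10:51:06.645 EST [11749] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed",
    "2019-04-13 08:21:09.926 EST [50291] postgres@[unknown] ERROR: 58P01: requested WAL segment 0000000100001B93000000B7 has already been removed",
    "2019-04-13 08:21:05.204 EST [50231] tracker@aaa LOG: 08006: could not receive data from client: Connection reset by peer",
]

counts, first_seen = summarize(sample)
for segment, count in counts.items():
    print(segment, count, "first in input at", first_seen[segment])
```

Each distinct PID in these lines is a separate walsender connection, so a steady stream of them for one segment suggests some client keeps reconnecting and asking for WAL the master no longer has.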
I did try pg_xlogdump on the master
and how do I specify it explicitly? This doesn't look like a file path.....1BCA/DA000000
I'm in the directory with the binlogs (the WAL files)
/storage/pg_xlog# /usr/lib/postgresql/9.5/bin/pg_xlogdump --start=1BCA/DA0E0258
pg_xlogdump: FATAL: could not find a valid record after 1BCA/DA0E0258
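On the "it's not a file path" question: an LSN can be mapped to the name of the WAL segment file that contains it, and `pg_xlogdump` also accepts a start segment file as a positional argument. A minimal sketch of that mapping, assuming the default 16 MB segment size and timeline 1 (the logs above say "on timeline 1"); `wal_file_name` is an illustrative helper, not a PostgreSQL API:

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default build-time segment size


def wal_file_name(lsn: str, timeline: int = 1,
                  segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the 24-hex-digit WAL file name that contains the given LSN."""
    hi, lo = (int(part, 16) for part in lsn.split("/"))
    position = (hi << 32) | lo
    segno = position // segment_size
    # Number of segments that fit into one 4 GiB "xlogid" (256 for 16 MB segments).
    segments_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


print(wal_file_name("1BCA/DA0E0258"))  # 0000000100001BCA000000DA
```

If that file is missing from the current directory, that alone could explain the "could not find a valid record" message. On a master (not a replica in recovery) the server-side function `pg_xlogfile_name()` gives the same answer for 9.5.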
2019-01-29 11:11:11.711 UTC [74015] postgres@aaa LOG: logical decoding found consistent point at 2E/8D6C7C0
2019-01-29 11:11:11.711 UTC [74015] postgres@aaa DETAIL: There are no running transactions.
2019-01-29 11:11:11.711 UTC [74015] postgres@aaa LOG: exported logical decoding snapshot: "000F0F26-1" with 0 transaction IDs
2019-01-29 11:11:11.721 UTC [74016] postgres@aaa LOG: starting logical decoding for slot "pgl_aaa_provider_ucad7ded"
2019-01-29 11:11:11.721 UTC [74016] postgres@aaa DETAIL: streaming transactions committing after 2E/8D6C7F8, reading WAL from 2E/8D6C7C0
2019-01-29 11:11:11.722 UTC [74016] postgres@aaa LOG: logical decoding found consistent point at 2E/8D6C7C0
2019-01-29 11:11:11.722 UTC [74016] postgres@aaa DETAIL: There are no running transactions.
slave
2019-01-29 11:14:04.002 UTC [74490] [unknown]@template1 LOG: manager worker [74490] at slot 2 generation 47 detaching cleanly
2019-01-29 11:14:05.004 UTC [74491] [unknown]@postgres LOG: manager worker [74491] at slot 2 generation 48 detaching cleanly