Is There a Way to Keep Personal Website off Google, (Search Engines), Etc. … ?
Author: Cf Guy
Hi All,
I've just designed my first personal website using Website X5 Pro. It is a personal site that showcases a variety of home improvement projects I need to have taken care of at my home - basically a place where various contractors can go to see the work that I need to have done. I am not looking for any type of "publicity" for the website, and would much prefer to keep it a "private, personal website" without having it listed in any way on search engines such as "Google", etc. I realize that this is the opposite of what many website developers are looking for, as there are whole books and YouTube tutorials on how to "get your website noticed" or "move it up in the search rankings". I would prefer to just be able to forward the website link to the contractors whom I have pre-screened, so that they may look at what work needs to be done.
I have already attempted to use the "access management" features built into Website X5 Pro to password protect the page with my personal contact information (I understand that Google "bots" and "scrapers" are not able to collect data from password protected areas of websites). I am not using the "access management/password" features on the pages showcasing the projects themselves, as none of them would identify me personally, and it is not as important to me to keep the data on these pages "private".
Is there anything else I may do, either before or right after publishing my website that would somehow send a "signal" to companies such as "Google" that I would prefer my website remain private and "unlisted"?
Many thanks for any help or suggestions anyone has to offer about this topic.
Thanks,
-Cfguy
Upload a file named "robots.txt" to the main directory of the web space, where index.html is, with the following content (adapt the file names to your own pages).
----- Code for robots.txt -----
# block three specific pages in the root directory for all robots
User-agent: *
Disallow: /geheim.html
Disallow: /streng-geheim.html
Disallow: /aeusserst-geheim.html
-----
If not all search engines are to be excluded, but only Google, then change this line as follows.
-----
User-agent: Googlebot
-----
The robots.txt file is only a hint for the search engines and only good search engines will follow it.
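If you want to sanity-check rules like the ones above before uploading the file, Python's standard `urllib.robotparser` module can evaluate them locally. This is only a minimal sketch: the domain `https://example.com` and the helper name `is_allowed` are placeholders made up for illustration, and it only tells you what a rule-following crawler would do.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt example in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /geheim.html
Disallow: /streng-geheim.html
Disallow: /aeusserst-geheim.html
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "https://example.com"  # hypothetical domain, replace with your own
    for path in ("/geheim.html", "/index.html"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", base + path))
```

Swapping `User-agent: *` for `User-agent: Googlebot` in the text and re-running the check with a different crawler name shows the effect of the Google-only variant.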
You can also include this meta tag in your site. But for Google's bot to read this meta tag and not index your site, you must not block it in robots.txt.
<meta name="googlebot" content="noindex, nofollow">
Try whichever one best fits your needs.
You have to place the meta tag in:
Step 1 - Statistics and Code - Code - Before closing the HEAD tag.
To prevent most search engines from indexing a page of your site, add the following tag:
<meta name="robots" content="noindex">
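After publishing, it can be worth confirming that the tag really ended up in the exported page. The following is a rough sketch using Python's standard `html.parser`; the class name `RobotsMetaFinder` and the function `has_noindex` are invented for this example, and it only inspects HTML you pass in as a string (for instance the saved source of a page).

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # meta name -> set of directive tokens

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            tokens = {
                token.strip().lower()
                for token in attr.get("content", "").split(",")
                if token.strip()
            }
            self.directives.setdefault(name, set()).update(tokens)


def has_noindex(html: str) -> bool:
    """Return True if any robots/googlebot meta tag contains 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return any("noindex" in tokens for tokens in finder.directives.values())
```

A misspelled name attribute (anything other than `robots` or a specific crawler name such as `googlebot`) is not picked up by this check, which mirrors how such a tag would likely be ignored by crawlers.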
If you have access to the server, there is one more setting you can change.
1) In cPanel, look for the advanced tools and open the "Indexes" option.
2) Choose the "public_html" folder (which is where all the files of the site are located) and click "EDIT".
3) Just select the "No Indexing" option and save the changes.
Note that this setting turns off the server's automatic directory listings (the file index shown for folders that have no index page), so bots cannot browse your folders. It does not by itself tell search engines not to index your pages, so use it together with the meta tag above.
Regards.
X5 has a function for that.
In 2 - settings you can specify which files or paths to exclude for the crawlers.
See more about the robots.txt here: https://guide.websitex5.com/en/support/solutions/articles/44000785436?utm_source=software&utm_medium=Pro_2021.3
NO!
It is a serious mistake to use the ROBOTS.TXT file to block pages!!!
You have to use the NOINDEX META TAG!!!
<meta name="robots" content="noindex">
It is explained well in Google's guide:
https://developers.google.com/search/docs/crawling-indexing/robots/intro
======
In addition to the meta tag, untick the option to add the page to the Sitemap.
===
For private pages I use host-side protection, with a password via htaccess.
>>> Note on robots.txt
-----
Incorrect use of robots.txt can have undesirable consequences
A particularly common mistake that is made in conjunction with the robots.txt file is the following: A page is already indexed by Google, but should be removed from the index. Instead of marking the page with "noindex", it is blocked in the robots.txt with "disallow". As a result, the page remains in the index, but Google no longer displays a description in the snippet, only an indication that the page is blocked by robots.txt.
>> https://www.seo-suedwest.de/seo-wissen/tipps-und-tricks/35-onsite-optimierung/3392-noindex-oder-robots-txt-wann-ist-welches-instrument-das-richtige.html
-----
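The conflict described in the note above (a page carries "noindex", but robots.txt stops the crawler from ever reading it) can be checked mechanically. Below is a small sketch with Python's standard `urllib.robotparser`; the function names and the four result labels are invented for this example, and whether a page has a noindex tag is passed in as a plain flag rather than detected.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the given crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def diagnose(robots_txt: str, url: str, page_has_noindex: bool) -> str:
    """Classify how robots.txt and a noindex tag interact for one page."""
    allowed = crawl_allowed(robots_txt, url)
    if page_has_noindex and not allowed:
        # The crawler is blocked, so it never gets to read the noindex tag.
        return "conflict"
    if page_has_noindex:
        # The crawler can fetch the page and see the noindex tag.
        return "noindex-readable"
    if not allowed:
        # Crawling is blocked, but the bare URL may still appear in results.
        return "crawl-blocked-only"
    return "indexable"
```

For pages that should disappear from search results, the result to aim for in this sketch is "noindex-readable", which means not listing that page under Disallow.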