# ==========================================
# GENERAL HOME SYSTEMS - SEO DOMINATION ROBOTS.TXT
# Toronto Home Security & Automation
# ==========================================
# Strategy: MAXIMUM INDEXING - Allow ALL legitimate crawlers
# Goal: Complete search visibility and AI model training access
# Last Updated: 2026-02-06

# ============================================================
# MAJOR SEARCH ENGINES - EXPLICIT ALLOW
# ============================================================

# Google (Search & Gemini)
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-Video
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: GoogleOther
Allow: /

User-agent: Google-InspectionTool
Allow: /

# Microsoft (Bing & Copilot)
User-agent: MSNBot
Allow: /

User-agent: bingbot
Allow: /

User-agent: BingPreview
Allow: /

User-agent: AdIdxBot
Allow: /

# Yahoo
User-agent: Slurp
Allow: /

# DuckDuckGo
User-agent: DuckDuckBot
Allow: /

# Baidu (China)
User-agent: Baiduspider
Allow: /

User-agent: Baiduspider-ads
Allow: /

User-agent: Baiduspider-cpro
Allow: /

User-agent: Baiduspider-favo
Allow: /

# Yandex (Russia)
User-agent: YandexBot
Allow: /

User-agent: YandexImages
Allow: /

User-agent: YandexVideo
Allow: /

User-agent: YandexMedia
Allow: /

# Ask.com
User-agent: Teoma
Allow: /

# AOL
User-agent: AOLBuild
Allow: /

# Alexa/Amazon
User-agent: ia_archiver
Allow: /

User-agent: Amazonbot
Allow: /

User-agent: Amazonbot-AdTester
Allow: /

# IBM Watson
User-agent: IBM-AlchemyConsumer
Allow: /

# ============================================================
# AI & LLM CRAWLERS - MAXIMUM ACCESS
# ============================================================

# OpenAI (ChatGPT, GPT-4, GPT Store)
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot-Store
Allow: /

# Anthropic (Claude)
User-agent: anthropic-ai
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: Claude-Bot
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /
User-agent: PerplexityBot-Image
Allow: /

# Google AI Models (Gemini, Bard)
User-agent: Google-Gemini
Allow: /

# Meta (Facebook, Instagram, Llama)
User-agent: FacebookBot
Allow: /

User-agent: FacebookExternalHit
Allow: /

User-agent: LlamaBot
Allow: /

# Common Crawl (Training data for many AI models)
User-agent: CCBot
Allow: /

User-agent: CC-Crawler
Allow: /

# Diffbot (AI knowledge extraction)
User-agent: Diffbot
Allow: /

# You.com
User-agent: YouBot
Allow: /

User-agent: YouBot-Image
Allow: /

# omgili (Discussions/Forums AI)
User-agent: omgili
Allow: /

User-agent: omgilibot
Allow: /

# Telegram (AI indexing)
User-agent: TelegramBot
Allow: /

User-agent: FacebookBot-Telegram
Allow: /

# Twitter/X
User-agent: Twitterbot
Allow: /

# LinkedIn
User-agent: LinkedInBot
Allow: /

# Pinterest (Visual Search AI)
User-agent: Pinterestbot
Allow: /

User-agent: Pinterest
Allow: /

# Reddit
User-agent: Redditbot
Allow: /

# Google Read Aloud
User-agent: Google-Read-Aloud
Allow: /

# Apple (Siri, Spotlight)
User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

# Microsoft Copilot
User-agent: MicrosoftPreview
Allow: /

User-agent: MSNBot-Media
Allow: /

User-agent: CopilotBot
Allow: /

# Brave Search (Leo AI)
User-agent: BraveBot
Allow: /

# Neeva AI
User-agent: Neevabot
Allow: /

# Andi (Search AI)
User-agent: AndiBot
Allow: /

# Whisper (Speech AI)
User-agent: WhisperBot
Allow: /

# ============================================================
# AGGREGATORS & DIRECTORIES
# ============================================================

User-agent: DotBot
Allow: /

User-agent: SemrushBot
Allow: /

User-agent: BacklinkCrawler
Allow: /

User-agent: seo-nastro-crawler
Allow: /

# Yelp (Business Reviews)
User-agent: YelpBot
Allow: /

# YellowPages
User-agent: YPBot
Allow: /

# Foursquare/Swarm
User-agent: Foursquare
Allow: /

# Trip Advisor
User-agent: TripAdvisorBot
Allow: /

# Trustpilot
User-agent: TrustpilotBid
Allow: /

# Better Business Bureau
User-agent: BBBot
Allow: /

# ============================================================
# LOCAL BUSINESS & REVIEW SITES
# ============================================================

User-agent: Googlebot-Mobile
Allow: /

User-agent: Googlebot-Smartphone
Allow: /

# Home Advisor
User-agent: HomeAdvisorBot
Allow: /

# Porch
User-agent: PorchBot
Allow: /

# Houzz
User-agent: HouzzBot
Allow: /

# Angie's List
User-agent: AngiesListBot
Allow: /

# ============================================================
# SECURITY & PERFORMANCE
# ============================================================
# Note: Crawl optimization should be handled server-side via rate limiting.
# Googlebot ignores Crawl-delay and Request-rate; support among other crawlers varies.

# ============================================================
# BLOCK MALICIOUS/UNWANTED BOTS
# ============================================================

# Restrict SEO crawlers to public pages (no admin or API paths).
# Each crawler gets a single group so its Allow and Disallow rules are read together.
User-agent: AhrefsBot
Allow: /
Disallow: /admin/
Disallow: /api/

User-agent: MJ12bot
Allow: /
Disallow: /admin/
Disallow: /api/

# Block harvesters
User-agent: EmailCollector
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: EmailExtractor
Disallow: /

# Block malicious scanners
User-agent: Nmap
Disallow: /

User-agent: Nikto
Disallow: /

User-agent: Qualys
Disallow: /

# ============================================================
# PROTECTED AREAS (Disallow for security)
# ============================================================
# Note: a crawler that matches a named group above follows only that group,
# so the rules below apply to crawlers without a group of their own.

User-agent: *
# Admin and configuration areas
Disallow: /admin/
Disallow: /administrator/
Disallow: /wp-admin/
Disallow: /api/private/
Disallow: /api/admin/
Disallow: /config/
Disallow: /settings/
Disallow: /.env$
Disallow: /.git/
Disallow: /svn/
Disallow: /node_modules/
# Temporary and cache files
Disallow: /tmp/
Disallow: /cache/
Disallow: /.cache/
# Development files
Disallow: /test/
Disallow: /dev/
Disallow: /staging/
Disallow: /development/
# Database and backup files
Disallow: /backup/
Disallow: /db/
Disallow: /database/
Disallow: /*.sql$
Disallow: /*.bak$
# Log files
Disallow: /logs/
Disallow: /error.log
Disallow: /access.log

# ============================================================
# SPECIFIC ALLOW RULES FOR SEO CRITICAL PATHS
# ============================================================

User-agent: *
# Allow all service area pages (CRITICAL for local SEO)
Allow: /service-areas/
Allow: /residential/
Allow: /commercial/
Allow: /home-automation/
# Allow blog content, guides, and resources
Allow: /blog/
Allow: /guides/
Allow: /resources/
# Allow location pages
Allow: /locations/
# Allow product and service pages
Allow: /products/
Allow: /services/
# Allow FAQ and Help content
Allow: /faq/
Allow: /help/
Allow: /support/

# ============================================================
# LOCATION TARGETING - TORONTO & GTA
# ============================================================

# Explicitly allow the Toronto and GTA service-area pages
User-agent: *
Allow: /service-areas/home-security-toronto/
Allow: /service-areas/home-security-mississauga/
Allow: /service-areas/home-security-brampton/
Allow: /service-areas/home-security-vaughan/
Allow: /service-areas/home-security-markham/
Allow: /service-areas/home-security-richmond-hill/
Allow: /service-areas/home-security-oakville/
Allow: /service-areas/home-security-burlington/
Allow: /service-areas/home-security-north-york/
Allow: /service-areas/home-security-scarborough/
Allow: /service-areas/home-security-etobicoke/
Allow: /service-areas/home-security-east-york/
Allow: /service-areas/home-security-york/

# ============================================================
# SITEMAP SUBMISSIONS
# ============================================================

# Main sitemap
Sitemap: https://generalhomesystems.ca/sitemap.xml

# Image sitemap (if exists)
Sitemap: https://generalhomesystems.ca/sitemap-image.xml

# Video sitemap (if exists)
Sitemap: https://generalhomesystems.ca/sitemap-video.xml
# News sitemap (if exists)
Sitemap: https://generalhomesystems.ca/sitemap-news.xml

# ============================================================
# AI TRAINING & OPTIMIZATION FILES
# ============================================================
# AI crawlers should reference these files for better understanding:
# - ai.txt: https://generalhomesystems.ca/ai.txt
# - llms.txt: https://generalhomesystems.ca/llms.txt
# - security.txt: https://generalhomesystems.ca/security.txt
# - ads.txt: https://generalhomesystems.ca/ads.txt
# These provide structured data for AI models to better understand:
# - Business capabilities and services
# - Service areas and locations
# - Pricing and packages
# - Emergency response procedures
# - Contact information and hours

# ============================================================
# STRUCTURED DATA HINTS FOR CRAWLERS
# ============================================================
# This website includes comprehensive Schema.org markup:
# - LocalBusiness schema for Toronto/Richmond Hill
# - Product schema for security systems
# - Service schema for installations
# - FAQPage schema for common questions
# - Review/AggregateRating for social proof
# - BreadcrumbList for navigation
# - GeoCircle for service area (100km radius)
# - Organization schema for brand entity
# - VideoObject for product demos
# Target Keywords:
# - Home Security Toronto
# - Security Systems GTA
# - Home Automation Ontario
# - Smart Home Installation
# - AI Security Cameras
# - 24/7 Monitoring Toronto
# - Access Control Systems
# - Video Surveillance

# ============================================================
# CRAWL INSTRUCTIONS FOR MAXIMUM INDEXING
# ============================================================

# Priority pages for crawling
User-agent: *
# Allow HTML pages and directory-style URLs
Allow: /*.html$
Allow: /*/$
# Disallow URL parameters (prevent duplicate content)
Disallow: /*?*
Disallow: /*&
# Disallow search and filter URLs (prevent infinite crawl space)
Disallow: /search/?*
Disallow: /filter/?*
# Disallow error pages (should not be indexed)
Disallow: /400
Disallow: /403
Disallow: /404
Disallow: /500
Disallow: /503
# Allow all images and videos (critical for visual search)
Allow: /images/
Allow: /videos/
Allow: /*.jpg$
Allow: /*.jpeg$
Allow: /*.png$
Allow: /*.gif$
Allow: /*.svg$
Allow: /*.webp$
Allow: /*.mp4$
Allow: /*.webm$

# ============================================================
# GEO-TARGETING INFORMATION
# ============================================================
# Primary Service Area: Greater Toronto Area (GTA), Ontario, Canada
# Latitude: 43.6532, Longitude: -79.3832
# Service Radius: 100km from Toronto core
# Coverage: Toronto, Mississauga, Brampton, Vaughan, Markham, etc.
# Target Regions: Ontario, Canada
# Language: English (en-CA)
# Currency: CAD
# Phone: +1-416-720-1068
# Business Hours: 24/7 Emergency Support

# ============================================================
# ADDITIONAL OPTIMIZATION DIRECTIVES
# ============================================================

User-agent: *
# Allow RSS feeds for content syndication
Allow: /rss/
Allow: /feed/
Allow: /*.xml$
# Allow favicon and icon access
Allow: /favicon.ico
Allow: /apple-touch-icon.png
Allow: /android-chrome-*.png
# Allow manifest files
Allow: /manifest.json
Allow: /site.webmanifest
Allow: /browserconfig.xml

# ============================================================
# VERIFICATION & VALIDATION
# ============================================================
# Verification tokens are not robots.txt directives; they are kept as comments for reference.
# Google Search Console Verification
# google-site-verification: 0akpDRr2WLTIc4pIts9kEF2YoSLqUwg8lhOWbavIhGU
# Bing Webmaster Tools
# msvalidate.01: YOUR-CODE-HERE
# Yandex Verification
# yandex-verification: YOUR-CODE-HERE

# ============================================================
# CONTACT FOR CRAWLER ISSUES
# ============================================================
# If a crawler experiences issues, contact:
# Email: generalhome1990@gmail.com
# Phone: +1-416-720-1068
# Website: https://generalhomesystems.ca/contact

# ============================================================
# END OF ROBOTS.TXT
# ============================================================
# Next review: March 6, 2026